SIGNIFICANCE OF INVENTORIES IN THE
CURRENT ECONOMIC SITUATION 


Louis J. Paradiso*
The Econometric Institute, Inc. 


It is the purpose of this paper to: (1) appraise the current
inventory level and recent additions in relation to other 
investment items in the economy and to the total value of 
output; and (2) analyze the current inventory levels relative 
to activity by broad industry groups. 


Since V-J Day, the course of inventory changes has led at times to
optimism and at other times to apprehension regarding the future
prospects of business. The inventory accumulation which occurred in 
the last half of 1945, for example, was salutary, since it reflected the 
gradual filling of the civilian pipelines which were depleted during the 
war years, and foreshadowed increased production of civilian goods. 
The rapid accumulation of inventories which occurred in the second half 
of 1946, on the other hand, was interpreted by many observers as a 
sign of impending decline in business activity. 

Those who are influenced in their business outlook by the course of 
inventory changes fail to assign to inventories their proper role in the 
business cycle. In theory, the part which inventories play in business 
fluctuations has been extensively developed. But theory is often cast 
aside when inventory accumulation reaches unduly large proportions. 
It is then argued that, since inventory additions are contributing significantly
to the total value of output and since such a high rate of accumulation
cannot go on indefinitely, business activity must turn down when the
inventory accumulation stops unless the inventory loss is
offset by rising activities elsewhere in the economy. This reasoning
overlooks the fact that inventory changes are a symptom and not a 
prime cause of business fluctuations. The inventory position at the 
beginning of a change in the direction of the cycle can moderate or 
accentuate the movement; it can spell profits or loss for an individual 
firm; and it can prolong or shorten the cyclical swing. In this sense, the
appraisal of the trends and levels of inventories by kinds of business
and by types of goods is of vital importance to businessmen as a guide
to inventory policy. Inventory trends, however, are not the guides to
use in forecasting the turning points of a cycle, since the historical records
indicate that they tend to lag behind a number of other basic economic
factors in the timing of the turning points.

* Paper delivered at the 107th Annual Meeting of The American Statistical Association in New
York City on December 28, 1948.


THE MAGNITUDE OF RECENT INVENTORY CHANGES IN THE ECONOMY 


Although inventory accumulation proceeded at a rapid rate soon
after V-J Day, it was not until the second half of 1946, following the end
of price controls, that the rate of accumulation became so great as to
cause considerable concern. During the last quarter of 1946, business
added a physical volume of inventories which, valued at the prices current
at the time, amounted to an annual rate of 5.5 billion dollars. In book value, this
represented an addition at an annual rate of 16 billion dollars. Thus, the
current value of additions to inventories in that quarter accounted for 
nearly one-fifth of the total gross private domestic investment. Further- 
more, this rate of accumulation meant that nearly 5 per cent of all 
goods (excluding services) produced in the economy during that quarter 
were for additions to existing stocks. 
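
The shares quoted above can be reproduced from the figures in Table I. The following is a minimal sketch; the dictionary layout and the function names are illustrative assumptions, and only the numbers come from the table.

```python
# Figures transcribed from Table I (billions of dollars, annual rates).
TABLE_I = {
    "1940":    {"current": 2.3, "book": 2.4,  "gpdi": 13.0, "gnp": 100.5},
    "1941":    {"current": 3.9, "book": 7.2,  "gpdi": 17.2, "gnp": 125.3},
    "1946 Q4": {"current": 5.5, "book": 16.0, "gpdi": 30.4, "gnp": 218.6},
    "1947 Q4": {"current": 0.6, "book": 8.0,  "gpdi": 29.9, "gnp": 240.9},
}


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def inventory_shares(period):
    """Inventory change as a share of gross private investment and of GNP."""
    row = TABLE_I[period]
    return {
        "current_to_gpdi": share(row["current"], row["gpdi"]),
        "book_to_gpdi": share(row["book"], row["gpdi"]),
        "current_to_gnp": share(row["current"], row["gnp"]),
        "book_to_gnp": share(row["book"], row["gnp"]),
    }


for period in TABLE_I:
    print(period, inventory_shares(period))
```

For the fourth quarter of 1946 the current-value share of gross private domestic investment comes to about 18 per cent, the "nearly one-fifth" cited in the text.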

In ordinary times such a rise would be symptomatic of trouble brew- 
ing elsewhere in the economy, such as in reduced buying power of indi- 
viduals and in distorted price, wage, and profit structures. Analysis of 
the basic economic developments in the past year, however, shows that 
demands for all types of goods were on the increase, that while distortions
were appearing in the price structure, the problem businessmen
were facing was to get a balance in inventories and to get enough goods 
of the right quality, style, and color to satisfy the ever-growing demands 
of consumers and producers, both domestic and foreign. The high rate 
of inventory accumulation experienced in the fourth quarter of 1946 
did not continue; nor did this spell recession. As better balance was 
obtained in the inventory holdings, supplies of goods flowing to ulti- 
mate consumers increased and the reduction in the rate of accumulation 
from 5.5 billion dollars in the fourth quarter of 1946 to less than one 
billion dollars in the fourth quarter of 1947, as shown in Table I, was 
offset by rising consumer expenditures, net foreign balance, and pro- 
ducers’ durables. The point commonly overlooked is that even though 
industrial production failed to rise materially during most of 1947, 
more goods were flowing to the ultimate consumer than in 1946 as less 
goods were being channeled into inventory additions. 

Let us examine in some detail the inventory position for the major
groups of industries: manufacturing, wholesale, and retail trades.


METHODS OF ANALYZING INVENTORIES


The usual method businessmen employ in appraising the inventory
position is to compare the current stock-sales ratio with that in a period
which would be considered as normal. This might be the average of a
pre-war period of years of good business, or 1940 or 1941. This method
has the advantage of simplicity, is easily understood and takes little
time to develop. It is useful in many cases where data are not available
over a long period and a more refined analysis cannot be made and where
the current ratios are substantially different from the period chosen as
normal.














TABLE I

                                                  Annual rates from
                                  1940     1941   4th Quarter  4th Quarter
                                                     1946         1947
                                           (Billions of Dollars)
Change in Inventories:
  Current Valuation                2.3      3.9       5.5          0.6
  Book Value                       2.4      7.2      16.0          8.0
Gross Private Domestic
  Investment                      13.0     17.2      30.4         29.9
Gross National Product           100.5    125.3     218.6        240.9

Ratios:                                       (Percentages)
Change in Inventories to
Gross Private Investment:
  Current Value                   17.7     22.7      18.1          2.0
  Book Value                      18.5     41.9      52.6         26.8
Change in Inventories to
Gross National Product:
  Current Value                    2.3      3.1       2.5          0.2
  Book Value                       2.4      5.7       7.3          3.3

Source: U. S. Department of Commerce.


The stock-sales ratio, however, has two distinct disadvantages and 
on that account can be misleading and result in a wrong and costly 
policy for buyers and merchandisers. First, the stability of the stock- 
sales ratio assumes that a given percentage change in sales is accompa- 
nied by a similar percentage change in stocks. For example, those who 
use the stock-sales ratio reason as follows: if a stock-sales
ratio of 3 is considered normal in a year such as 1941, then a
ratio of 2 in 1947, under conditions of substantially greater sales, would
imply thinness of inventories. This conclusion may or may not be
correct. Further analysis might show that in the particular business a
change of 10 per cent in sales is accompanied by a change of only 5
per cent in inventories and that therefore, with greater sales, the stock-sales
ratio should be lower in order that stocks be just adequate to support
the larger volume of sales. In other words, in this example, the stock-sales
ratio of 2 would be in line with the historical experience of the
firm and would indicate a balance rather than thinness in stocks. Second,
use of the stock-sales ratio assumes that the stocks at any period
of time must bear a fixed relation to the rate of sales in that period. 
Actually, inventory changes are the resultant of dynamic factors and 
are influenced by changes in activity of previous periods as well as long- 
term changes in turnover and merchandising policies of business firms. 
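
The first objection can be illustrated numerically. The sketch below assumes, purely for illustration, that stocks respond to sales with a constant elasticity (a 10 per cent change in sales going with a 5 per cent change in stocks corresponds to an elasticity of 0.5); the function name and the sales multiple of 2.25 are hypothetical and are not figures from the paper.

```python
def warranted_ratio(base_ratio, sales_multiple, elasticity):
    """Stock-sales ratio in line with past behavior after sales have changed.

    Assumption: stocks scale by sales_multiple ** elasticity while sales
    scale by sales_multiple, so the ratio scales by
    sales_multiple ** (elasticity - 1).
    """
    return base_ratio * sales_multiple ** (elasticity - 1.0)


# A ratio of 3 was "normal" in the base year; sales are now 2.25 times as large.
print(warranted_ratio(3.0, 2.25, 0.5))  # stocks move half as fast as sales
print(warranted_ratio(3.0, 2.25, 1.0))  # stocks move in proportion to sales
```

Under these assumptions a ratio of 2 is exactly what past experience warrants once sales have risen to 2.25 times the base level, whereas a fixed-ratio rule would still call for 3.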

A more fruitful and logical technique for analyzing inventories is 
that of developing the relationship of inventories to the basic economic 
factors affecting their fluctuations. This is the approach used in this 
paper. 

THE DATA 

Overall inventory data are not available on a monthly basis before 
1939. Year-end data therefore must be relied upon for the earlier years. 
Even so, the existing data are not complete for all years. Census infor- 
mation is available for some years; data on department store stocks for 
all years back to 1919; the Bureau of Internal Revenue is the source 
for value of inventories held by corporations. In this report, the esti- 
mates made by Simon Kuznets and the National Bureau of Economic 
Research have been relied on primarily for the data before 1939. 
These estimates have been spliced to the Department of Commerce 
data beginning with December 31, 1938. The benchmark used by the 
Department was the Census of Manufactures and Census of Business 
for 1939 which contain data on the value of inventories at the beginning 
and the end of 1939. Since 1939 monthly data have been compiled by the 
Department on the basis of a representative sample of firms covering 
manufacturers, wholesalers and retailers. 

In general, the values of inventories analyzed are book value figures 
which reflect not only the current value of physical additions but also 
revaluations resulting from price changes. In view of the fact that price 
changes also affect sales or the value of output, the book value is the 
more appropriate measure to relate to the value of output or sales. 

It is difficult to appraise the reliability of existing data on inventories. 
Book value figures reflect the various practices of firms which are not 
uniform from one industry to another or even within an industry. 
Various tests have indicated that, in general, existing sources tend to 
provide less reliable data on the levels of inventories but are much more 
accurate with regard to the movements or changes. Since the analysis 
presented in this report is concerned with the development of historical 
relationships involving two or more variables, the accuracy of the level 
of inventories is not as important as the reliability of the size and direction
of the inventory change.


MANUFACTURERS’ INVENTORIES IN RELATION TO PRODUCTION


It has been found that the value of manufacturers’ inventories is de- 
pendent not only on the value of output during a given period but also 
on the value of output in a previous period and on a gradual trend of the 
residuals of inventories after eliminating the influence of changes in 
the value of output. 

Using the data for the years 1920 to 1941, the regression of invento- 
ries on value of products is as follows: 


(1)  $v_i = 3.45 - .0122\,t + .0661\,v_c + .0633\,v_p$,


where
$v_i$ = average value of manufacturers’ inventories (arithmetic average
of values at end of current and previous years),
$t = 2(\text{year} - 1931) + 1$,
$v_c$ = value of manufactured products in current year,
$v_p$ = value of manufactured products in previous year.
The values are in billions of dollars.


Table II and Chart I compare the actual inventories with the values 
calculated from the value of product according to regression (1). The 
fit is obviously very good. For the period 1920-41 the average percentage
error (ratio of the residuals to the calculated values) is only 2.5
per cent, with a maximum error of 8 per cent in 1920; the multiple
correlation coefficient is 97 per cent.¹

The fact that the inventory changes can be estimated very closely 
from changes in the value of output in the previous and current year 
provides a standard or a gauge for appraising the current inventory 
position relative to the value of production. 
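
Regression (1) is simple to evaluate directly. The sketch below applies the coefficients of formula (1) to the value-of-output figures in Table II; the function names are illustrative assumptions, not part of the paper.

```python
def trend(year):
    """Time index of formula (1): t = 2(year - 1931) + 1."""
    return 2 * (year - 1931) + 1


def calculated_inventories(year, output_current, output_previous):
    """Formula (1): calculated average manufacturers' inventories, $ billions."""
    return (3.45
            - 0.0122 * trend(year)
            + 0.0661 * output_current
            + 0.0633 * output_previous)


def residual_percent(actual, calculated):
    """Residual expressed as a percentage of the calculated value."""
    return 100.0 * (actual - calculated) / calculated


# Value of output for the current and previous year, from Table II ($ billions).
calc_1940 = calculated_inventories(1940, 66.0, 56.8)
calc_1947 = calculated_inventories(1947, 168.8, 125.7)
print(round(calc_1940, 1), round(calc_1947, 1))

# Implied change per calendar year from the trend term alone, in $ millions
# (t advances by 2 for each year).
print(round(-2 * 0.0122 * 1000, 1))
```

Rounded to one decimal, the two calculated values agree with the 11.2 and 22.2 billion dollars shown in Table II, and the trend term works out to a decline of roughly 24 million dollars a year.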

The formula implies the following three results based on the expe- 
rience of the pre-war period, 1920-1941: 

1. Everything else being equal, the value of manufacturers’ 
inventories declined on the average by 24 million dollars per year. 
This has resulted from a steady improvement in the flow of mate- 
rials and from increased efficiency and better planning on the part 
of management, particularly the gradual introduction of inventory 
control systems whereby the inventory units are geared to the 
rate of production. 

2. Other factors being equal, an increase (or decrease) of 1 billion
dollars in the value of output in the current year has been associated
with a rise (or decline) of 66 million dollars in the average
value of inventories.

3. Assuming the same current year value of output and no
change in other factors, if the value of output in the previous year
had been 1 billion dollars greater (or smaller), the average inventories
in the current year would have been 63 million dollars
greater (or smaller).

1 If the value of output of the previous year is excluded from the analysis, the multiple correlation
coefficient is 91 per cent.


TABLE II

Value of Inventories of Manufacturing Industries

           Total            Non-durables         Calculated     Per Cent of Residual
                                                              Inventories to Calculated
      Value of   Average  Value of   Average
Year   Output*  Value of   Output‡  Value of    Total§      Non-     Total      Non-
                  Inven-              Inven-            durable¶             durable
                 tories†             tories†
                        (Billions of Dollars)

1920      69.5      11.2      33.5       5.8      12.1       6.1      -8.0      -4.9
1921      41.6      10.3      26.6       5.2      10.8       5.4      -4.8      -3.9
1922      50.0       9.8      28.0       5.1       9.6       5.2       2.0         0
1923      58.2      11.0      32.9       5.7      10.6       5.6       3.7       1.8
1924      53.5      11.5      30.5       5.8      10.8       5.6       6.7       3.6
1925      60.8      11.6      35.1       5.9      11.0       5.8       5.7       1.7
1926      62.5      11.8      36.0       6.2      11.5       6.1       2.5       1.6
1927      60.3      11.5      35.8       6.1      11.5       6.1         0         0
1928      61.2      11.3      36.5       6.2      11.4       6.1         0       1.6
1929      68.0      11.7      38.4       6.4      11.8       6.3      -0.9       1.6
1930      51.0      11.2      32.5       6.0      11.1       5.9       0.9       1.7
1931      39.8       9.5      26.0       5.1       9.3       5.1       3.3         0
1932      28.5       7.6      17.5       4.2       7.8       4.2      -2.6         0
1933      30.6       7.2      21.0       4.0       7.3       4.1         0      -2.4
1934      37.0       7.8      24.0       4.5       7.8       4.4         0       2.3
1935      45.0       8.2      28.1       4.7       8.6       4.8      -4.9      -2.1
1936      53.0       9.2      31.0       5.1       9.7       5.2      -5.4      -1.9
1937      60.7      10.6      34.8       5.7      10.6       5.3         0       7.5
1938      47.0      10.5      31.5       5.6      10.2       5.5       2.9       1.8
1939      56.8      10.2      33.6       5.5      10.0       5.5       2.0         0
1940      66.0      11.2      36.4       5.8      11.2       5.8         0         0
1941      93.5      13.5      47.4       6.5      13.6       6.7      -0.7      -3.0
1942     121.2      17.0      56.4       7.8      17.2       7.9      -0.6      -1.3
1943     148.5      17.6      62.3       7.9      20.6       8.7     -17.0      -9.2
1944     156.3      17.3      66.9       7.8      22.8       9.3     -31.8     -15.1
1945     140.2      16.4      69.1       8.1      22.3       9.7     -36.0     -16.5
1946     125.7      18.3      74.1       9.3      20.2      10.1      10.4      -6.9
1947     168.8      22.1      95.2      10.8      22.2      11.9       0.5      -9.2

* Source: 1919-1939, census years from Dept. of Commerce; 1939-1947 from Dept. of Commerce.
For inter-census years, interpolated; see Appendix Table 1.

† Source: 1929-1947 from Dept. of Commerce. 1919-1929 from National Bureau of Economic
Research.

‡ See Appendix Table 2.

§ Calculated from Average Value of Manufacturing Inventories = 3.45 - .0122t + .0661 (value of
product, current year, in billions of dollars) + .0633 (value of product, previous year, in billions of
dollars), where t = 2(year - 1931) + 1.

¶ Calculated from Average Value of Non-durable Goods Inventories = 1.67 - .00986t + .0756 (current
year value of product, billions of dollars) + .0460 (preceding year value of product, billions of
dollars), where t = 2(year - 1931) + 1.

Note: For 1939-1947 the value of output is represented by the value of shipments, which differs only
by a small amount from the former.

The fact that the value of output of the previous year is about equal 
in importance to that of the current year in its influence on inventories 
implies that the current level of inventories is determined approximately 
by the rate of the value of output six months earlier. This amount of 
lag in inventory relative to sales is applicable to manufacturing indus- 
tries as a group and undoubtedly varies industry by industry and com- 
pany by company. 
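
One way to read the six-month figure is as a weighted average lag. The sketch below assumes that current-year output enters with zero lag and previous-year output with a lag of one year, and weights the two by their coefficients in formula (1); this interpretation and the function name are assumptions made for illustration.

```python
def mean_lag_months(weight_current, weight_previous):
    """Coefficient-weighted average lag, in months.

    Assumption: current-year output has a lag of 0 years and
    previous-year output a lag of 1 year.
    """
    lag_years = weight_previous / (weight_current + weight_previous)
    return 12.0 * lag_years


# Coefficients on current and previous year output in formula (1).
print(round(mean_lag_months(0.0661, 0.0633), 2))
```

With nearly equal weights the average lag comes out just under six months, in line with the statement in the text.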

What is the current position of manufacturers’ inventories relative 
to the pre-war relation of inventories to production? On the basis of 
the 1947 value of output of 169 billion dollars and that of 1946 of 126 
billion dollars, the average value of inventories in 1947 is calculated to 
be 22.2 billion dollars according to formula (1). The actual average for 
the year is estimated at 22.1 billion dollars, or about equal to the amount 
warranted by the value of output. Thus manufacturers’ inventories 
were in line in relation to the pre-war stocks-production relation during
1947.² The rise in the value of these inventories of about 500 million
dollars which occurred during the final quarter of 1947 did not bring
stocks out of line with the value of products since the latter also increased
from the previous quarter. From September to October, for
example, the value of products increased by 8 per cent while the value
of inventories rose by less than 1.5 per cent. Thus the value of manufacturers’
inventories at the close of 1947 was in balance on the basis
of the pre-war stocks-output relations. This condition, however, varies
by industries within the manufacturing group.


CHART I

VALUE OF MANUFACTURERS’ INVENTORIES
ACTUAL AND CALCULATED
(Billions of Dollars)


368 AMERICAN STATISTICAL ASSOCIATION JOURNAL SEPTEMBER 1948

When manufacturers’ inventories are analyzed by the two major 
classifications of industries—durable and non-durable goods—it is 
found that in the case of non-durable goods the value of inventories in 
1947 was deficient relative to the value of product to the extent of over 
1 billion dollars. This is derived from the pre-war relationship of the 
value of inventories of the non-durable goods industries to value of 
products of the current year and the preceding year. The formula de- 
rived from the data for the period 1920-1941 is as follows for non-dur- 
able goods manufacturing industries: 


(2)  $v_{in} = 1.67 - .00986\,t + .0756\,v_c + .046\,v_p$,

where the symbols have the same meanings as in equation (1), except
that they refer to non-durable goods: $v_{in}$ is the average value of
non-durable goods inventories (arithmetic average of inventories at the end
of the current year and at the end of the preceding year), $v_c$ and $v_p$ are
the values of product in the current and preceding years, and
$t = 2(\text{year} - 1931) + 1$.

As Table II and Chart II show, the formula approximates the actual
value of inventories very closely, the average error for the period
1920-1941 being 2 per cent, while the index of determination is 98 per cent.³

The fact that inventories are currently not up to the levels warranted 
by production indicates (1) that supplies of certain goods were hard to 
get and stocks could not be built up to required levels and (2) that man- 
ufacturers of non-durables deliberately kept their inventories at a mini- 
mum in a period when sales were at record levels. 

In contrast to the non-durable goods industries, inventories of the
durable goods group are currently over one billion dollars above the
amount indicated by the pre-war relationship of inventories to value of
product of these industries. This is implied from the results obtained
by using formulas (1) and (2). At first sight, this would seem to indicate
some trouble for these industries. Actually, the reason for the “excessive”
inventories in these industries is that there is considerable unbalance
in the type of inventory holdings so that many companies hold
large stocks of certain parts and components while they lack sufficient
supplies of other goods needed to achieve an expanding volume of output.
This means that the flow of materials has been uneven and part
of the final output was held down because of inability to get a balanced
supply of materials going into production. This situation has persisted
ever since V-J Day and has been a major factor in limiting the flow
of much needed durable goods to the ultimate consumers.

2 It may be noted that in 1940 when inventories were exactly in line with the relationship to value
of products as given by formula (1), the stock-sales ratio was .170 whereas in 1947 the ratio was .131.
On this basis it would appear that inventories in 1947 were not up to the 1940 position relative to value
of products.

3 If the value of output of the previous year is excluded from the analysis the index of correlation
is 95 per cent.
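
The split between non-durable and durable goods can be traced through formulas (1) and (2). The sketch below uses the 1946 and 1947 figures from Table II and treats the durable goods group as the difference between the total and the non-durable group, which is an assumption about how the durable figure is implied; the function names are illustrative.

```python
def trend(year):
    """Time index used in formulas (1) and (2): t = 2(year - 1931) + 1."""
    return 2 * (year - 1931) + 1


def calc_total(year, output_current, output_previous):
    """Formula (1): all manufacturing, $ billions."""
    return (3.45 - 0.0122 * trend(year)
            + 0.0661 * output_current + 0.0633 * output_previous)


def calc_nondurable(year, output_current, output_previous):
    """Formula (2): non-durable goods industries, $ billions."""
    return (1.67 - 0.00986 * trend(year)
            + 0.0756 * output_current + 0.046 * output_previous)


# 1947 and 1946 values from Table II ($ billions).
total_calc = calc_total(1947, 168.8, 125.7)
nondurable_calc = calc_nondurable(1947, 95.2, 74.1)
actual_total, actual_nondurable = 22.1, 10.8

nondurable_shortfall = nondurable_calc - actual_nondurable
# Durable goods taken as total minus non-durable (an assumption).
durable_excess = ((actual_total - actual_nondurable)
                  - (total_calc - nondurable_calc))

print(round(nondurable_calc, 2),
      round(nondurable_shortfall, 2),
      round(durable_excess, 2))
```

On these figures the non-durable calculated value is about 11.95 billion dollars (Table II shows 11.9), leaving non-durable inventories a little over 1 billion dollars short and durable inventories a little over 1 billion dollars above the calculated level.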


WHOLESALE INVENTORIES 


Inventories held by wholesalers are much smaller than those held by 
retailers or manufacturers in relation to the volume of business done. 
The ratio of wholesale inventories to wholesale sales in October 1947 
was .45; for retail, it was 1.07; and for manufacturing, it was 1.49. 
Furthermore, the change in inventories is not so great in wholesale for 
a given change in business as it is in manufacturing or retail. In other 
words, wholesalers have a more rapid turnover and can get along with 
relatively smaller stocks.

As in the case of manufacturing, the value of wholesale inventories
is related to the value of current year’s sales, previous year’s sales and 
a time factor. 

The specific formula for the period 1930-1941 is as follows: 


(3)  $v_{iw} = 1.04 - .032\,t + .0222\,s_c + .02726\,s_p$,
where
$v_{iw}$ = average value of wholesale inventories (average of values at the
beginning and end of the current year),
$t = 2(\text{year} - 1936) + 1$,
$s_c$ = current year’s sales,
$s_p$ = previous year’s sales,
and all money figures are in billions of dollars.


This relationship shows that in fact the level of the previous year’s 
sales has a greater weight than that of the current year. As Table III 
and Chart III show, the average percentage error for the 12 year period 
is 1.8 per cent; the index of correlation is .999.⁴ The 1947 average value
of wholesale inventories estimated at 6.5 billion dollars was 0.9 billion 
dollars lower than the amount calculated on the basis of the relationship 
to sales. Thus in contrast to manufacturing, it appears that the value 
of inventories held by wholesalers is below the economically warranted 
levels assuming the continuation of pre-war relationships. 
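
The wholesale comparison follows from formula (3) and the sales figures in Table III. A minimal sketch, with illustrative function names:

```python
def wholesale_trend(year):
    """Time index of formula (3): t = 2(year - 1936) + 1."""
    return 2 * (year - 1936) + 1


def calc_wholesale(year, sales_current, sales_previous):
    """Formula (3): calculated average wholesale inventories, $ billions."""
    return (1.04 - 0.032 * wholesale_trend(year)
            + 0.0222 * sales_current + 0.02726 * sales_previous)


# Wholesale sales for 1947 and 1946 from Table III ($ billions).
calc_1947 = calc_wholesale(1947, 158.4, 131.6)
gap_1947 = calc_1947 - 6.50  # 6.50 is the actual 1947 average in Table III

print(round(calc_1947, 2), round(gap_1947, 1))
```

The calculated value comes to about 7.41 billion dollars (Table III shows 7.42), so the actual 6.5 billion dollars falls short by roughly 0.9 billion dollars, as stated.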


RETAIL INVENTORIES 


Of especial interest to businessmen is the appraisal of retail inven- 
tories since retailers are the first to feel the impact of changing con- 
sumer buying, price resistance, and shifting tastes. Many are concerned 
that retail inventories are rising at uncomfortably rapid rates and that 
such a development might spell danger in the near future. 

At this point it is well to distinguish between the appraisal of retail 
inventories in the aggregate as against the inventory holdings of par- 
ticular items by individual firms. In the former case, the impact of ex- 
cessive or deficient inventories is on the economy as a whole, while in 
the latter case it affects management’s policy decisions to purchase, 
order, or liquidate goods. In this analysis, we are concerned with the 
broader aspects of inventories on an industry-wide basis. This will pro- 
vide retailers with the facts on the over-all inventory position of their 
industry. It also provides a technique and method which each retailer 
can apply to analyze the inventory-sales problem for his particular 
commodities. 





4 If the value of output of the previous year is excluded from the analysis the index of correlation
is .93.

The average value of retail inventories during a year is related to retail
sales of the current year and of the previous year. In addition, after allowing
for the level of sales, the value of retail inventories has shown a declining trend in
the period 1920-1941. Specifically, the regression equation relating
inventories to sales for the period 1920-1941 is as follows:


(4)  $v_{ir} = 6.5 - .638\,t + .815\,s_c + .507\,s_p$,
where
$v_{ir}$ = average value of retail inventories (average of values at the end
of the current and previous years),
$t = 2(\text{year} - 1931) + 1$,
$s_c$ = retail sales in the current year,
$s_p$ = retail sales in the previous year.
Values and sales are in billions of dollars.
TABLE III

TRADE INVENTORIES

        Retail Trade          Wholesale          Calculated     Per Cent of Residual
                                                              Inventories to Calculated
                Value of            Value of
Year    Sales*    Inven-    Sales*    Inven-   Retail‡    Whole-    Retail    Whole-
                 tories†             tories†               sale§                sale
                        (Billions of Dollars)

1920      43.0      7.45                          7.38                 0.9
1921      33.8      6.78                          6.80                -0.3
1922      34.3      6.04                          6.25                -1.8
1923      39.3      6.48                          6.55                -1.1
1924      39.4      6.83                          6.69                 2.1
1925      42.9      6.97                          6.85                 1.8
1926      45.7      7.19                          7.13                 0.8
1927      44.8      7.06                          7.07                -0.1
1928      46.0      6.96                          6.99                -0.4
1929      48.5      7.15                          7.13                 0.3
1930      42.0      6.61      53.7      4.48      6.60      4.41       0.2       3.7
1931      34.8      5.40      40.6      3.64      5.55      3.70      -2.7      -1.8
1932      25.0      4.29      30.8      2.95      4.26      3.06       0.7      -3.5
1933      24.5      3.81      30.0      2.74      3.60      2.71       5.8       1.2
1934      28.7      3.91      33.4      2.84      3.79      2.70       3.2       5.2
1935      32.8      4.09      42.8      2.87      4.20      2.94      -2.6      -2.5
1936      38.3      4.50      52.2      3.26      4.74      3.34      -5.1      -2.5
1937      42.2      4.97      57.6      3.72      5.20      3.65      -4.4       1.8
1938      38.1      5.06      50.0      3.56      4.94      3.56      -2.4         0
1939      42.0      5.00      55.3      3.43      4.92      3.41       1.6       0.5
1940      46.4      5.34      61.8      3.64      5.35      3.64      -0.2         0
1941      55.5      6.29      83.6      4.21      6.19      4.23       1.6       0.5
1942      57.6      6.94      93.2      4.34      6.69      4.98       3.7     -12.9
1943      63.7      6.69      99.3      3.98      7.37      5.31       6.7     -25.0
1944      69.5      6.44     103.4      3.98      7.82      5.50     -17.6     -27.6
1945      76.6      6.36     105.4      4.14      8.57      5.60     -25.8     -26.1
1946      96.8      7.83     131.6      5.11     10.45      6.17     -25.1     -17.2
1947     109.5     10.09     158.4      6.50     12.38      7.42     -18.5     -12.4

* Source: Department of Commerce 1929-1947. 1919-1929 based on interpolation with consumers’
expenditures data on goods, department store sales, etc.

† Source: 1929-1947 from Dept. of Commerce. 1919-1929 from National Bureau of Economic
Research.

‡ Calculated from Average Value of Retail Inventories = 6.5 - .638t + .815 (retail sales current year,
bills. of $) + .507 (retail sales previous year, bills. of $), where t = 2(year - 1931) + 1.

§ Calculated from Average Value of Wholesale Inventories = 1.04 - .032t + .0222 (current year
sales, bills. of $) + .02726 (previous year’s sales, bills. of $), where t = 2(year - 1936) + 1.


This relationship shows that the level of current year retail sales 
has over one and a half times as much weight as the retail sales of the 
previous year in its effect on inventories. As Table III and Chart IV 
show, the correlation is very tight for the period with an average an- 
nual error of 2 per cent for the 22-year period, and a maximum residual 
of only 6 per cent in 1933. The index of correlation is .99.⁵





5 If the value of output is excluded from the analysis the index of correlation is 95 per cent.


CHART IV

RETAIL INVENTORIES
ACTUAL AND CALCULATED
(Billions of Dollars, 1920-1947)


In sharp contrast to the inventory position in manufacturing, retail 
inventories currently are significantly below the amount calculated 
from sales on the basis of formula (4). The average value of retail in- 
ventories in 1947 was 10 billion dollars whereas on the basis of 1947 
retail sales of 109.5 billion dollars and of 1946 sales of 96.8 billion dol- 
lars, formula (4) yields a calculated value of inventories of 12.4 billion 
dollars. This means that retailers today are operating with a sig- 
nificantly smaller inventory than is indicated by the pre-war relation 
of sales to inventories. There are three reasons for this: first, many 
goods are still short in supply and retailers cannot accumulate suf- 
ficient quantities to bring stocks up to the pre-war relation of stocks 
to sales. Examples of relative shortages in the consumer durable goods 
field are of course well-known. To some extent this situation also pre- 
vails in certain soft goods. One of the large department stores in New 
York, for example, found recently that of the customers who failed to 
buy anything at all, a significant proportion wanted goods which the 
store did not have on hand. In other words, if certain goods had been 
more plentiful, sales of many retail stores would have been greater.
Second, many retailers have been pursuing a deliberate policy of holding
inventories down to a minimum. They are uncertain of the future
because they have never experienced such high levels of sales and they 
do not want to be caught with high-priced stocks when competition 
begins in earnest. Third, the economy since V-J Day has been operating 
under conditions of relatively full employment with a concomitant high 
level of retail sales. A characteristic feature of this period was the short- 
ages of many goods relative to demand. Competition has only been a 
factor in recent months. Since retailers could sell practically anything 
they had, the turnover rate has been greater than in the pre-war 
period and they did not find it necessary to accumulate inventories to 
conform with pre-war stock-sales relation in order to achieve a high 
rate of sales. 
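
The retail comparison can be checked in the same way, with one caveat. Evaluated literally with sales in billions of dollars, the coefficients of formula (4) as printed return a figure ten times the calculated values shown in Table III. The sketch below therefore assumes the result is expressed in hundreds of millions of dollars and divides by ten; that scaling is an inference from the table, not something the text states, and the function names are illustrative.

```python
def trend(year):
    """Time index of formula (4): t = 2(year - 1931) + 1."""
    return 2 * (year - 1931) + 1


def formula_4_as_printed(year, sales_current, sales_previous):
    """Formula (4) with the printed coefficients; sales in $ billions."""
    return (6.5 - 0.638 * trend(year)
            + 0.815 * sales_current + 0.507 * sales_previous)


def calc_retail_billions(year, sales_current, sales_previous):
    """Calculated retail inventories in $ billions.

    Assumption: the printed formula yields hundreds of millions of
    dollars, so the result is divided by 10 to match Table III.
    """
    return formula_4_as_printed(year, sales_current, sales_previous) / 10.0


# Retail sales from Table III ($ billions).
calc_1940 = calc_retail_billions(1940, 46.4, 42.0)
calc_1947 = calc_retail_billions(1947, 109.5, 96.8)
shortfall_1947 = calc_1947 - 10.09  # 10.09 is the actual 1947 average

print(round(calc_1940, 2), round(calc_1947, 2), round(shortfall_1947, 1))
```

With that scaling the 1940 and 1947 results match the 5.35 and 12.38 billion dollars in Table III, and the 1947 shortfall is about 2.3 billion dollars.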


DEPARTMENT STORE STOCKS 


The data on inventories by kinds of retail business are not sufficiently 
reliable over a long enough period to permit a correlation analysis with 
sales. However, analysis of the sales-inventory relation for department
stores is probably typical of other major types of non-durable
goods stores.


APPENDIX TABLE 1

Method for Interpolating Value of Products in Manufacturing

            (1)           (2)            (3)            (4)
                          FRB            BLS          Index of
         Value of    Manufacturing   Prices Other   Manufacturers’
         Products     Production      Than Farm       Products
Year  (Bills. Dols.)                   Products
                                  (1935-39 = 100)

1919        60.2            72          161.6          116.4
1920        69.5            74          190.1          140.7
1921        41.6            56          122.9           68.8
1922        50.0            74          119.5           88.4
1923        58.2            86          123.9          106.6
1924        53.5            81          119.2           96.6
1925        60.8            90          124.5          112.0
1926        62.5            95          122.8          116.7
1927        60.3            94          116.2          109.2
1928        61.2            99          116.4          115.2
1929        68.0           116          114.6          126.1
1930        51.0            90          105.5           95.0
1931        39.8            74           91.6           67.8
1932        28.5            57           83.9           47.8
1933        30.6            68           84.7           57.6
1934        37.0            74           94.4           69.9
1935        45.0            87           98.5           85.7
1936        53.0           104           99.1          103.1
1937        60.7           113          105.8          119.6
1938        47.0            87           99.0           86.1
1939        56.8           109           97.6          106.4

Note: Value of Products in odd years from Census; even years interpolated by regression of value
of manufacturers’ products of Column 4 and census value.
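
The interpolation described in the note can be sketched as follows. The exact form of the regression is not given, so a simple least-squares line fitted to the census years is assumed here; the names are illustrative, and the results will not necessarily reproduce the even-year figures printed in the table.

```python
# (value of products in $ billions, value index 1935-39 = 100) for the
# census years, transcribed from columns (1) and (4) of Appendix Table 1.
CENSUS = {
    1919: (60.2, 116.4), 1921: (41.6, 68.8), 1923: (58.2, 106.6),
    1925: (60.8, 112.0), 1927: (60.3, 109.2), 1929: (68.0, 126.1),
    1931: (39.8, 67.8), 1933: (30.6, 57.6), 1935: (45.0, 85.7),
    1937: (60.7, 119.6), 1939: (56.8, 106.4),
}


def value_index(production_index, price_index):
    """Column (4): product of the production and price indexes, over 100."""
    return production_index * price_index / 100.0


def fit_line(pairs):
    """Least-squares intercept and slope for value = a + b * index."""
    n = len(pairs)
    mean_x = sum(x for _, x in pairs) / n
    mean_y = sum(y for y, _ in pairs) / n
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(list(CENSUS.values()))


def interpolate(production_index, price_index):
    """Estimated value of products ($ billions) for a non-census year."""
    return intercept + slope * value_index(production_index, price_index)


# 1926: FRB production index 95, BLS price index 122.8 (columns 2 and 3).
estimate_1926 = interpolate(95, 122.8)
print(round(estimate_1926, 1))
```

Fitting on the census years only keeps the interpolated years out of the estimation, which is the natural reading of the note.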


The value of inventories held by department stores is related to the
current year sales of department stores and sales of the previous year
according to the following regression developed for the period
1920-1941:


(5)  $v_{id} = 1.4 - .993\,t + .8864\,s_c + .2539\,s_p$,
where
$v_{id}$ = average value of inventories of department stores (index,
1935-39 = 100),
$t = 2(\text{year} - 1931) + 1$,
$s_c$ = department stores’ sales in the current year (1935-39 = 100),
$s_p$ = department stores’ sales in the previous year (1935-39 = 100).


Here again, as Chart V shows, the degree of correspondence is very 
close with an annual average error for the entire period of 2 per cent 





6 The data on inventories and sales are based on published indexes of the Board of Governors of 
the Federal Reserve System in various bulletins of that agency. 
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and an index of correlation of 97.3 per cent. On the basis of an index of 
department store sales average of 281 for 1947 and 264 for 1946 (1935- 
39= 100), the index of the value of department store inventories is 
calculated to be 285. This compares with the estimated actual index of 
inventories of 253. Thus, currently, department store stocks are low 
when appraised in terms of the pre-war stocks-sales relationship. Chief 
factors in this situation are the deliberate policy of department store 
executives to hold inventories down to a minimum and inability to get 
adequate supplies of many goods. 
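
The calculation for 1947 can be reproduced directly from equation (5). The sketch below uses the coefficients and the sales indexes quoted in the text; the function and variable names are illustrative and do not come from the paper.

```python
def dept_store_inventory_index(year, sales_current, sales_previous):
    """Equation (5): calculated inventory index (1935-39 = 100)."""
    t = 2 * (year - 1931) + 1          # time variable as defined under equation (5)
    return 1.4 - 0.993 * t + 0.8864 * sales_current + 0.2539 * sales_previous


calculated = dept_store_inventory_index(1947, sales_current=281, sales_previous=264)
actual = 253                            # estimated actual index quoted in the text

print(round(calculated))                # 285
print(round(calculated - actual))       # 32 index points below the pre-war relationship
```

For 1947 the time variable is 33, and the three terms sum to roughly 284.7, which rounds to the 285 cited above.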


APPENDIX—TABLE 2

Method for Interpolating the Value of Products in Manufacturing
Non-durable Goods Industries

           (1)              (2)              (3)            (4)
                         Wholesale
      Production of      Prices in        Value of        Value of
       Non-durable     Non-durables      Non-durable      Products
Year      Goods         1939 = 100      Manufactures

1919        62             186.6             153            36.3
1920        60             218.0             131            33.5
1921        57             139.4              79            26.6
1922        67             132.4              89            28.0
1923        72             135.7              98            32.9
1924        69             132.3              91            30.5
1925        76             140.7             107            35.1
1926        79             137.9             109            36.0
1927        83             128.0             106            35.8
1928        85             129.3             110            36.5
1929        93             126.0             117            38.4
1930        84             114.7              97            32.5
1931        79              97.4              77            26.0
1932        70              86.1              60            18.5
1933        79              87.4              69            21.0
1934        81              98.7              80            24.0
1935        90             105.1              95            28.1
1936       100             106.1             106            31.0
1937       106             112.7             119            34.8
1938        95             101.9              97            31.5
1939       109             100.0             109            33.6

Column 1: Federal Reserve Board.

Column 2: Obtained by combining BLS components of non-durable goods prices by BLS weights.

Column 3: Product of Columns 1 and 2.

Column 4: Odd years—Census of Manufactures; even years—interpolated by a regression between
Column 3 and Census values for the odd years.
MEASURING PHYSICAL INVENTORIES*


Weir M. Brown 
Board of Governors, Federal Reserve System 


SUMMARY 


This paper seeks to examine some of the problems encoun- 
tered in measuring and interpreting the physical volume of 
inventories and to suggest some methods of approaching these 
problems. Public attention to the subject of inventory changes 
certainly has not been lacking in the period since VJ-Day. Indeed,
the growth in inventories has to some persons appeared
so significant that the period is sometimes characterized
simply—probably too simply—as an “inventory boom.”
This paper will not attempt to describe the course of recent
inventory fluctuation, aside from a few isolated illustrative re- 
marks in passing. Rather, this opportunity will be taken to 
scrutinize the fundamental paths of approach to a field in 
which existing knowledge must still be regarded, by any sober 
standard, as limited. 


THE TERM “inventories,” as ordinarily used, is applied to the stocks
of finished goods held by business firms for sale and to stocks of
raw or semi-finished materials held either for sale or for processing into 
marketable goods of another form. Thus in customary usage the term 
refers principally to stocks in the hands of manufacturers and distribu- 
tors engaged in making and selling tangible goods in the market. It is 
not always perceived that inventories in actuality are held also by 
service and extractive industries, government organizations, and indi- 
vidual consumers, and that conceptually the idea of inventory is uni- 
versal. In this paper, the term “inventory” is employed in a broad sense 
to refer to any stock of unconsumed goods, in whatsoever position in 
the economy it may be held. 

If we imagine for a moment a hypothetical economy which is per- 
fectly static—in which industrial techniques and consumer tastes are 
fixed, the volume of employment and production constant, and with 
even seasonal variations spirited out of existence—it is obvious that in 
such a state the amounts and kinds of goods held in inventory likewise 
would have become perfectly constant, having adjusted themselves to 
the technical requirements of an unchanging level of consumption of 
raw materials on the one hand and a fixed production and consumption 
of finished products on the other. 


* Paper delivered at the Annual Meeting of the American Statistical Association, New York City, 
December 28, 1947. 
But in the real world, the quantities of goods held in stock in various
sectors of the economy are not constant but fluctuate. That is the pri- 
mary reason why inventories hold any interest for us or present any 
analytical challenge. An increasing demand for the products of one 
industry which results in expanded output will ordinarily operate to 
increase the work-in-process stocks of the industry and may also affect, 
in one direction or the other, the size of the materials and finished prod- 
ucts inventories as well, although the short-run effects (on, say, finished 
goods stocks) may be quite different from the long-period effects. Of 
equal importance with changes in demand are varying supply condi- 
tions, whether of a periodic or occasional nature. When the market 
changes represent broad forces affecting the general state of demand or 
conditions of supply, the shifts imposed on the composition and size 
of inventory holdings may be very marked and complicated. In turn, 
the line of influence may flow in the other direction, for inventory 
levels and movements therein may themselves bear importantly on the 
economic situation. 

When we concentrate on the change aspect, it is perhaps easier to 
elucidate the meaning of inventories and to understand their relevance 
to the functioning of the system generally. The stock of any good in 
store at the beginning of a given period constitutes that quantity which 
was produced in preceding periods but not consumed. It is a capital 
good (although not necessarily a producer’s good, of course). It is 
analogous in many respects with other kinds of physical assets, both as 
it relates to the individual holder and in relation to the economy as a 
whole. A transfer of inventory between two individual firms is an act 
of investment for the buyer and one of disinvestment for the seller. For 
the system as a whole, an increase in the total amount of goods held in 
inventory by all firms constitutes a positive investment, and a cor- 
responding diminution of total stocks of goods represents a consump- 
tion of capital or disinvestment. This is a point to which we shall need 
to return later. 


MEASUREMENT CONSIDERED WITH REFERENCE TO PURPOSE 


The breadth or narrowness with which inventories are defined de- 
pends upon the purpose in view. Moreover, the purpose of the investi- 
gation also determines whether stocks will be measured in terms of 
physical quantities or dollar values, and it influences decisions regard- 
ing the exact manner in which any necessary combining of individual 
series will be accomplished. It was pointed out above that in customary 
usage the emphasis in discussions of inventories is ordinarily upon 
so-called “business inventories,” that is, the holdings of manufacturing
plants and wholesale and retail distributors. There are several respect- 
able reasons in partial justification for this emphasis. For one thing, 
current statistical data are more readily available for these groups than 
for some others, although this circumstance is adventitious rather than 
inherent. A second reason is that the stocks of some other groups which 
are important quantitatively—such as those of family consumers or of 
public utility companies—are not being held for sale to others and hence 
do not “hang over the market,” in the same sense as the stocks of manu- 
facturers or retailers. For some purposes, it may be permissible to ex- 
clude such holdings by groups other than manufacturers and dis- 
tributors from one’s definition of total stocks; or, if they are not so 
explicitly excluded, to accept with complacency the present unavail- 
ability of the necessary current statistics. Yet for other purposes—and 
these include the task of analyzing at any time the current business 
situation—the results will be only partial and probably will be mislead- 
ing if stocks in all positions are not or cannot be adequately considered. 
Although it is true that some stocks are not being held for sale and 
hence do not manifest themselves prominently in direct transactions, 
such stocks nevertheless do “hang over the market” in a very real 
sense. The size of these holdings and changes in their magnitudes are 
important determinants governing the behavior of the holders, influ- 
encing not only the strength of their demand for the commodity in 
question but also conditioning their productive efforts. Manufacturing 
and trade inventories are undoubtedly large, and attention had been 
concentrated on them even before Adam Smith used the term “the 
profits of stock” in describing the activities of the 18th century mer- 
chant capitalists. But stocks of unconsumed goods are widely held in 
many sectors of the economy. An addition to or a withdrawal from 
stocks represents the amount by which the consumption of a given 
good differs from the amount produced during the period. There can 
be no satisfactory measurement of consumption unless adequate in- 
formation can be developed concerning movements in stocks held in 
all positions within whatever sector of the economy is under examina- 
tion. 
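
The accounting relation stated in this paragraph, that consumption equals production less the net addition to stocks, can be sketched in a few lines. All figures are invented for illustration, and the function name and the three holding positions are assumptions made for the example.

```python
def apparent_consumption(production, opening, closing, positions=None):
    """Production minus the net change in stocks over the listed positions."""
    if positions is None:
        positions = list(opening)
    change_in_stocks = sum(closing[p] - opening[p] for p in positions)
    return production - change_in_stocks


# Hypothetical physical quantities of one commodity, by holder.
opening = {"manufacturers": 200, "distributors": 120, "households": 80}
closing = {"manufacturers": 230, "distributors": 110, "households": 100}

full = apparent_consumption(1000, opening, closing)
business_only = apparent_consumption(
    1000, opening, closing, positions=["manufacturers", "distributors"]
)

print(full)            # 960
print(business_only)   # 980
```

In this example, leaving household stocks out of the reckoning overstates consumption by 20 units, which is exactly the unobserved addition to household holdings.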

For some purposes a useful, though seldom used, compilation of 
inventory data can be made on a commodity basis, tracing the stock 
holdings of a given commodity or group of commodities through all 
stages of the productive, distributive, and consumption process. This 
type of presentation has applications both in the study of particular 
industries or commodity groups and in describing the inventory position
of larger sections of the economy. There are instances in which,
for the purpose in hand, it does not matter appreciably at what stage
in the economic process stocks of specified finished goods are held, and
all such finished stocks can be considered as a total, or at least consid- 
ered together. This might be the case, for example, if the problem were 
to ascertain, for certain consumer goods in short supply, what quantity 
could be available for consumption under conditions of strictly-exer- 
cised government controls. Under other circumstances, however, the 
manner in which a given total amount of inventory was divided
among manufacturers, retailers, and other holders might make a substantial
difference to the price developments to be anticipated or to the
measures required to make a given amount of the commodity available 
for export. Also, it is scarcely necessary to say that at times when major 
movements are taking place in prices and the volume of business, the 
manner in which inventories are divided among holders will help to 
determine which groups in the economic structure will reap profits or 
sustain losses. 

Chart 1 gives specific illustration to this fact that inventory holdings 
of the same commodity in different positions may differ considerably, 
both as to magnitude at any one date and as to movements over time. 
Stocks in the hands of two groups—manufacturers and wholesale dis- 
tributors (including chain store warehouses)—are presented for three 
different manufactured food items: evaporated milk, canned peaches, 
and total canned fruits and vegetables. The series charted are index 
numbers representing the physical volume of end-of-month or end-of- 
quarter stocks during the period 1942-1947, with the 1943 average for 
each series taken as 100. The several series exhibit different relation- 
ships not only in the comparative month-to-month movements of the 
two groups of holdings but also in their current levels relative to the 
base year. Since no current data are available on the stocks of these 
commodities held by retailers, family households, or exporters, the 
chart does not give a complete account of total stocks in all hands. It 
does suggest the necessity of caution in interpreting changes in the 
quantities held in any one position or in drawing inferences from changes 
in the ratio of stocks to production (or sales) at any one stage. 
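
The index form used in Chart 1, with the 1943 average of each series taken as 100, can be sketched as follows. The quarterly quantities are invented and are not the data behind the chart; the function name is hypothetical.

```python
def rebase(series, base_year):
    """Express a physical stock series as an index, base-year average = 100.

    `series` maps (year, quarter) to a physical quantity.
    """
    base_values = [qty for (year, _), qty in series.items() if year == base_year]
    base_average = sum(base_values) / len(base_values)
    return {period: 100.0 * qty / base_average for period, qty in series.items()}


# Hypothetical end-of-quarter stocks of one commodity in two positions.
manufacturers = {(1943, 1): 40, (1943, 2): 20, (1943, 3): 60, (1943, 4): 80,
                 (1947, 4): 100}
wholesalers = {(1943, 1): 10, (1943, 2): 10, (1943, 3): 10, (1943, 4): 10,
               (1947, 4): 8}

print(rebase(manufacturers, 1943)[(1947, 4)])   # 200.0
print(rebase(wholesalers, 1943)[(1947, 4)])     # 80.0
```

Here the same commodity stands at twice its base-year level in one position and below it in the other, so neither series alone describes total stocks.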

Another method of classifying or grouping inventories the meaning 
of which needs to be closely examined with reference to any given pur- 
pose is the segregation of stocks according to “stage of fabrication.” 
This is the familiar three-fold classification of raw materials, goods in 
process, and finished goods. This method of division is a valid one when 
confined to the various holdings of a particular firm, since such a segre- 
gation can be made unambiguous (or virtually so) and since it serves 
the operational needs of the management. The same method may sometimes
be applied to an entire industry, although greater care must be
exercised. But if we should desire to classify the total inventories of all 
manufacturing industry according to stage of fabrication, it is doubtful 
if our purpose would be accomplished by adding up in one column all 
the goods classed as “finished” by their producers, in another column 
all the goods individually reported as “raw materials,” etc. Such a proc- 
ess would have the anomalous result of including in the tally of the 
nation’s stocks of finished goods a substantial quantity of wood pulp, 
sulfuric acid, pig iron, cotton yarn, and other very primary products 
which, from the standpoint of the economy as a whole, are scarcely 
finished. It would also place another large portion of the wood pulp, 
pig iron, and similar stocks in the raw materials category, since they 
are so regarded by the paper and rayon manufacturers and steel mills 
which hold them for further processing. An alternative approach to the 
question of measuring the degree of fabrication of manufacturing stocks 
which may prove more fruitful, though admittedly more difficult, is 
that of classifying inventories (measured in either physical or value 
terms) according to whether they are entirely finished products, inter- 
mediate goods, or primary materials with reference to the manufactur- 
ing process as a whole. 
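
The contrast between the two methods of tallying can be illustrated with a small sketch. The holders, quantities (taken to be in a common value unit), and the economy-wide stage assigned to each commodity are all assumptions made for the example.

```python
from collections import defaultdict

# (holder, commodity, stage as reported by the holder, amount) - hypothetical.
holdings = [
    ("pulp mill", "wood pulp", "finished", 50),
    ("paper mill", "wood pulp", "raw material", 30),
    ("paper mill", "writing paper", "finished", 20),
    ("blast furnace", "pig iron", "finished", 40),
    ("steel mill", "pig iron", "raw material", 25),
]

# Assumed classification with reference to manufacturing as a whole.
economy_stage = {"wood pulp": "primary", "pig iron": "primary",
                 "writing paper": "finished"}


def tally_by_reported_stage(rows):
    """Add up stocks under whatever stage each holder reports."""
    totals = defaultdict(int)
    for _holder, _commodity, stage, amount in rows:
        totals[stage] += amount
    return dict(totals)


def tally_by_economy_stage(rows, stage_of):
    """Add up stocks by the stage of the commodity in the process as a whole."""
    totals = defaultdict(int)
    for _holder, commodity, _stage, amount in rows:
        totals[stage_of[commodity]] += amount
    return dict(totals)


print(tally_by_reported_stage(holdings))                # {'finished': 110, 'raw material': 55}
print(tally_by_economy_stage(holdings, economy_stage))  # {'primary': 145, 'finished': 20}
```

Under the first tally most of the "finished" total consists of wood pulp and pig iron; under the second the same stocks fall in the primary class regardless of who holds them.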


INVENTORY MEASUREMENT PROBLEMS 


We turn now to a more detailed discussion of the measurement prob- 
lems encountered in dealing with inventories, with special reference to 
the task of reckoning physical volume. It will be apparent from what 
was said earlier in analyzing the nature of inventories that they are 
additive in character, and to a greater degree than some other economic 
variables, such as the value of product. This characteristic derives from 
the fact that inventories are pertinent to a point in time rather than to 
a period of time. A stock of goods is a fund concept rather than a flow 
concept; it is a quantity rather than a rate. Since the same stock cannot 
be found in more than one point on the same date, there are no problems 
of overlapping or duplication (apart from errors in collecting data) 
encountered in adding up inventories. This is not to suggest, of course, 
that economic variables which embody flow concepts (i.e., so much 
quantity per unit of time) cannot be added or that rates cannot be 
averaged. The problems are greater, however, due to the familiar fact 
that during any given period a good may pass through more than one 
manufacturing or distribution stage, and at the end of each stage it 
will appear as a “product.” 

Despite the fact that inventories do possess the quality of being additive
conceptually, there are nevertheless important problems to be
faced in any actual process of summation. These difficulties vary from 
one situation to another, depending upon the kind of commodity groups 
involved, upon the extent of the economic area or areas over which com- 
bined measurement is sought, and upon the kind of terms (value or 
quantity) in which the primary data are expressed. Some of the more 
significant issues can be divided into two broad groups, those arising 
in the process of dealing with value figures and those arising in the use 
of quantity figures. Both the physical-volume approach and the dollar- 
value approach to inventory measurement are large subjects, and in a 
paper on physical inventory measurement our discussion of value data 
must be confined largely to the usability of such data in deriving indi- 
cators of quantity. First, however, I should like to allude briefly to 
certain features characteristic of inventory value figures in general 
which compromise their meaningfulness when aggregated, whether in 
their original form or in some deflated form. 

A fundamental consideration is that figures on inventory values are 
creatures of corporate cost accounting. As such, they can probably be 
regarded as having been designed to fit the operational needs of the 
management—or at least as conforming to whatever practices have 
grown up and become habitual in the company. But there are several 
limitations attaching to the collecting and adding together for purposes 
of economic analysis of inventory figures which represent accounting 
book records developed for such different purposes by firms utilizing 
heterogeneous accounting methods. I have reference here principally to 
the limitations arising out of the great lack of uniformity among companies
in the accounting treatment accorded various types of inventory.
These characteristics of the original data complicate and equivocate 
any aggregate measures of the total value of inventories. 

But a much more difficult task of measurement and interpretation 
is encountered if one attempts to employ any given series comprising 
the total dollar value of inventories as source material from which to 
derive an indication of movements in the physical quantity of stocks. 
It is apparent that inventory value figures, like any other value aggre- 
gates, are compounds of two types of element—a quantity element and 
a valuation or pecuniary element—and that changes in the value of 
inventories may be brought about by changes in either component. 
Because of the fact that the value of inventories does embody a pe- 
cuniary element and is known to respond, to some degree and with 
more or less flexibility, to changes in the prices at which goods are ex- 
changed in the market, it is often thought that inventories reported in 
value terms should be deflated by some generally-known price index in 
an endeavor to obtain a measure of quantity changes apart from the
influence of price movements. But the pecuniary elements which are 
embedded in accounting records of inventory value are not comparable 
with available price index series. In fact, these pecuniary elements do 
not constitute price data at all. Consequently, an attempted physical- 
volume determination through a deflation of an inventory value index 
confronts the obstacles (a) that corporate inventory records usually 
do not carry stocks of finished goods at current market selling prices 
but at one or another of several different cost figures having various 
time references; (b) that a significant portion of total inventories in 
some sectors of the economy represents intermediate products or goods 
in process which seldom if ever enter the market; (c) that many firms 
follow a practice of making certain adjustments in one or more of their 
inventory figures at the end of an accounting period to reflect changes in 
replacement costs; (d) that it is difficult to achieve comparable com- 
modity coverage, even for finished goods, between the price index and 
the inventory value index; and (e) that, even when the commodities
covered conform closely, the weighting systems employed in available 
price indexes take account of relative importance of various goods sold 
rather than relative importance of different goods in inventory. Because 
of these difficulties one could hardly expect a process of deflating in- 
ventory value data to yield a satisfactory measure of movements in the 
physical magnitude of stocks. 
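
Obstacle (a) can be made concrete with a small numerical sketch. The data are invented: physical stocks are held constant, market prices rise, and book costs are assumed to follow market prices with a one-period lag. The function name is hypothetical.

```python
def deflate(value_index, price_index):
    """Divide a value index by a price index, period by period (base = 100)."""
    return [100.0 * v / p for v, p in zip(value_index, price_index)]


# Hypothetical series, all on a base of 100 in the first period.
quantity = [100, 100, 100, 100]        # physical stocks do not change
market_price = [100, 120, 150, 150]    # the available price index
book_cost = [100, 100, 120, 150]       # cost valuation lagging one period (assumed)

book_value = [q * c / 100 for q, c in zip(quantity, book_cost)]
deflated = deflate(book_value, market_price)

print([round(x, 1) for x in deflated])   # [100.0, 83.3, 80.0, 100.0]
```

Although the physical quantity never moves, the deflated series falls by a fifth and then recovers, solely because the valuation embedded in the book figures is not the price measured by the deflator.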

In a recent investigation of physical stocks in the hands of food 
manufacturers,¹ the deflation procedure was also tried to see how a series
showing the value of inventory in terms of constant prices would
compare, in this field, with an index of the physical quantity of stocks 
measured directly. The differences in behavior of the two types of index 
were marked. The deflated value series showed much less regularity of 
seasonal movement within itself than did the physical index; the short- 
term movements of the two series did not correspond closely; and there 
were significant differences in the postwar levels of the two indexes 
relative to the base period. Although the dissimilarity was partly at- 
tributable to differences in coverage and differences between the weight- 
ing system employed in the price deflator and that of the inventory 
series, a substantial part of the dissimilarity was believed explainable 
by the difficulties inherent in attempting to relate the cost-accounting 
valuations represented in inventory book-value data to changes in 
market prices. 





1 Described in an unpublished manuscript, which has been circulated within the Federal Reserve 
System in mimeographed form, “Changes in Physical Stocks Held by Food Manufacturers” (December 
1947). 
DIRECT MEASUREMENT OF PHYSICAL INVENTORIES


In view of the dubious meaning of figures arrived at through a defla- 
tion of inventory value series, it seems desirable to approach the task 
of ascertaining the physical volume of inventories directly through the 
use of data expressed in quantity terms. This is no novel idea, of course. 
Physical inventory statistics undoubtedly would be more widely em- 
ployed if they were more readily available. Some of the problems en- 
countered in developing such measures should be described. 

First, as to individual series. Current series on inventories in quan- 
tity terms are regularly collected for numerous commodities. There is 
cause for gratitude in the fact that the number of individual commodi- 
ties covered by current series has slowly increased. Most of the series 
emanate from government agencies or trade associations, and ordinar- 
ily they relate to items which are the major products of or principal 
materials consumed in a given industry or trade group. The available 
stocks series ordinarily cover a very high proportion or the entire 
quantity of such stocks held in particular positions. 

After these encouraging facts have been enumerated, it still must be 
acknowledged that the series presently available on commodity stocks 
are inadequate in several respects. (1) Although there are many stocks 
series on manufacturing industries, a few for agriculture and wholesaling,
and a scattering for other groups such as importers and Federal
government agencies, the coverage of industries is by no means uni- 
form. Even within the manufacturing sector of the economy, where 
coverage is greatest, there are wide differences in the amount of physi- 
cal stocks data available. If we consider the manufacturing field ac- 
cording to Census industry groups, we find reasonably satisfactory 
current data for foods, textile-mill products, paper and pulp, rubber, 
petroleum and coal products, and the stone, clay, and glass group. On 
the other hand, there are no continuous stock series whatsoever for 
the motor-vehicle, transportation equipment, or furniture groups and 
virtually none for machinery. The remaining industries fall somewhere 
between these extremes. (2) A second point of inadequacy is that cover- 
age is not uniform as to type of commodity. More series are available 
on raw materials or on finished products of a primary sort than on 
highly advanced or diversified finished products. There are very few 
series on stock holdings of wrapping and packaging materials or of fuel 
held by consuming plants. Fortunately, the proportion of total inven- 
tories left uncovered by these circumstances is smaller than the abso- 
lute number. (3) Outside of the manufacturing sector, series on physical 
stocks are even less numerous. In general, data on physical stocks are 
fewer than inventory value data in almost every major sector of the
economy. The comparison is especially unfavorable in retail trade. 
Exceptions to this generalization are found in agriculture and in a few
other cases. It is noteworthy that there have even been some attempts 
to gather information on consumer stocks, though there are few con- 
tinuous series in this important field. 

Apart from the question of obtaining individual series for particular 
commodities, there are the further problems of how individual series 
may be combined and how inventory movements in broad areas of the 
economy may be represented. Some tentative recommendations on 
these matters can be advanced, based on the investigation we have 
made of stocks in the hands of food manufacturers, although methods 
employed in that field might require more or less alteration if applied 
elsewhere. First of all, where the series to be combined refer to many 
heterogeneous commodities, group measures cannot be obtained by 
direct summation, and any index number will have to be an average 
of relatives. In most cases, some system of explicit weights will appear 
desirable, in order to accord greater influence in the total index to move- 
ments in stocks of major importance than to movements in those of 
minor importance. In constructing the index of stocks held by food 
manufacturers, it was decided that the best criterion of “importance” 
for this purpose was provided by the value of inventory of the respec- 
tive commodities in 1939 as recorded by the Census of Manufactures. 
The use of weights derived from inventory value data means that inso- 
far as the “q” elements embodied in the value figures correspond with 
the “q’s” in the series themselves, we are in effect weighting with 1939 
prices. In making use of Census of Manufactures data for deriving 
weights, two other technical problems arise which can only be men- 
tioned in passing. One is the task of deriving from Census inventory 
value figures collected on an industry basis estimates of inventory val- 
ues for particular commodities within the industry, and the other is 
the adjustment required by the fact that the Census figures represent 
year-end values rather than averages for the year. 
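
The index form described above, a weighted average of quantity relatives with base-year inventory values as weights, can be sketched as follows. The commodities, quantities, and weights are invented, and the function names are hypothetical. The second function illustrates the remark that value weights amount to weighting with base-year prices, on the assumption that each weight equals base-year price times base-year quantity.

```python
def average_of_relatives(current, base, weights):
    """Weighted arithmetic mean of quantity relatives (base period = 100)."""
    total_weight = sum(weights.values())
    return sum(
        weights[c] * 100.0 * current[c] / base[c] for c in weights
    ) / total_weight


def fixed_price_aggregate(current, base, weights):
    """Same index written as current quantities valued at base-year unit values."""
    unit_value = {c: weights[c] / base[c] for c in weights}   # implied base prices
    numerator = sum(unit_value[c] * current[c] for c in weights)
    denominator = sum(unit_value[c] * base[c] for c in weights)
    return 100.0 * numerator / denominator


# Hypothetical physical stocks (each in its own unit) and base-year values.
base = {"evaporated milk": 200, "canned peaches": 50, "butter": 80}
current = {"evaporated milk": 300, "canned peaches": 40, "butter": 80}
weights = {"evaporated milk": 30.0, "canned peaches": 10.0, "butter": 60.0}

print(round(average_of_relatives(current, base, weights), 1))    # 113.0
print(round(fixed_price_aggregate(current, base, weights), 1))   # 113.0
```

Because each series enters only as a relative, commodities measured in different physical units can be combined without being summed directly.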

An important fact to be observed is that in measuring stocks by this 
method many items are left “unrepresented,” i.e., any representation 
is only by the index as a whole and not by parts. The decision to use 
this method is dictated by the difficulty of inferring movements in the 
inventory of one commodity from movements in stocks of some other 
commodity for which data are available. In measuring production,
there are certain well-known relationships which can be relied upon 
with more or less security in statistical measurement. For example, 
certain goods are made as “joint products” of the same manufacturing
process. If a series of figures is available on the slaughter of cattle, for 
instance, it can be made to serve as a measure not only of packing-house 
production of beef but also of cattle hide production. Sometimes the 
consumption of materials is used to represent the activity involved in 
processing these materials into finished products. There is also a rela- 
tionship, though of a different sort, between the output of a raw mater- 
ial and that of a finished product in which it is a major ingredient. But 
relationships between the stocks of two different goods are much less 
predictable than output relationships. Even in the instance of jointly- 
produced goods like beef and hides, there can be no assurance that if the 
packer’s stocks of hides are growing his stocks of beef are likewise in- 
creasing. Demand factors may be such as to move stocks of the joint 
product in the opposite direction or to leave them unchanged. There 
are analogous difficulties in dealing with raw material stocks. 

If one of the available series included in an index of stocks were to 
be accepted as measuring not only the fluctuations of that one com- 
modity but also variations in some other material or product, the im- 
portance of the unrepresented items could be assigned, by means of 
“imputed weights,” to the included series. But the decision to impute 
weights in this fashion would involve the assumption that fluctuations 
in an available series would faithfully reflect movements in the unrep- 
resented stocks whose weights it carried. In food manufacturing, the 
available statistical material has not yielded empirically any reliable 
relationships on the strength of which a system of indirect representation
could be justified; moreover, there are reasons a priori, as mentioned
in the paragraph above, for believing that in some cases, at least, no 
reliable relationship exists. 

For these reasons, it appeared best practice in computing the index 
of food manufacturers’ stocks to let any given series represent only the 
stocks of the commodities covered directly; the weights of other items 
were not imputed to the series of represented items. When this method 
is followed, the selection of items for inclusion in an index is based on 
their measurability and the resulting index is representative of move- 
ments not of all the relevant stocks but only of those for which quantity 
data are included in the index. In the case of the food index, further 
investigation is required to establish the validity of using this index as 
a measure of total food stocks in manufacturing. However, the infer- 
ence should not be drawn that these uncovered items would actually 
have been “represented”—in the sense that the index would be more 
faithfully reflective of movements of all food inventories—if weights 
attributed to them had been assigned to some one or more series of
represented stocks. The imputation of weights, the process of nominat- 
ing one statistical series to double or act as proxy for another, has many 
legitimate uses, but it is invalid unless the relationship between the 
two is so well established that a substitute can be relied upon to behave 
in a manner similar to, or predictably different from, that of the absent 
member. Although this has not been found to be the case in the work 
here described, there is no implication that the same lack of consistent 
relationship would necessarily be found in stock data in other com- 
modity fields. 
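
The difference between the two treatments of uncovered items can be seen in a small sketch. The commodities, weights, and relatives are invented, and the function names are hypothetical; stocks of hides are taken to be unobserved.

```python
def index_of_covered(relatives, weights):
    """Index in which each series represents only the stocks it covers."""
    covered = [c for c in weights if c in relatives]
    return sum(weights[c] * relatives[c] for c in covered) / sum(
        weights[c] for c in covered
    )


def index_with_imputation(relatives, weights, proxy_for):
    """Index in which the weight of each uncovered item is added to a proxy series."""
    effective = {c: weights[c] for c in relatives}
    for item, proxy in proxy_for.items():
        effective[proxy] += weights[item]
    return sum(effective[c] * relatives[c] for c in effective) / sum(
        effective.values()
    )


weights = {"beef": 40.0, "hides": 10.0, "flour": 50.0}   # base-year inventory values
relatives = {"beef": 120.0, "flour": 90.0}               # no series for hides

print(round(index_of_covered(relatives, weights), 1))                          # 103.3
print(round(index_with_imputation(relatives, weights, {"hides": "beef"}), 1))  # 105.0
```

The imputed index is the better measure of all stocks only if hides in fact moved with beef; otherwise it merely gives the beef series more influence, which is the caution expressed in the text.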


CONCLUSIONS


Any conclusions to which this examination may lead should be re- 
garded as tentative. However, I should like, in review, to emphasize 
certain points. Just as there can be no single, all-purpose index of prices, 
there is no single measure of inventories—nor even a single method of 
measurement—which can serve all purposes. We need to be critical in 
scrutinizing any index or other compilation of inventory data to make 
sure of its meaning or relevance to the problem in hand. Both the dollar- 
value approach and the physical-volume approach have an important 
place in economic analysis; but they do not always have the same 
place. Speaking generally, inventory value figures are more appropriate 
for use in conjunction with corporate financial-statement data and with 
some other types of data expressed in dollar values. Physical stocks 
figures are more appropriate for use in connection with production 
measures and certain other data reckoned in quantity terms. Because 
of fundamental differences in their nature, data on book values often 
cannot be made to yield fully satisfactory indications of changes in the 
physical quantities of stocks. Physical stocks held in some parts of the 
economy can be directly measured with a fair degree of reliability from 
present data, but before indexes representative of very broad areas of 
the economy can be constructed there must be a substantial increase 
in the number and coverage of basic series. Since the concept of inven- 
tory is considerably broader, in my opinion, than current usage would 
indicate, there is a corresponding need to extend the investigation of 
stocks to commodities and to economic groups—notably family house- 
holds—hitherto largely ignored. 

The further study of inventories should also be extended in two other 
directions. Detailed study is needed of the relationships between the 
size of inventories in a particular industry and its rate of activity. In- 
vestigation to date seems to point to several preliminary conclusions. 
(1) The ratio of stocks to sales (or production) is not constant at all 
levels of activity. (2) Different classes of inventory within the same 
industry exhibit different movements in relation to activity. Often 
there is a fairly close association between the amount of goods in proc- 
ess and the rate of operations; but there are many possible relation- 
ships between changes in activity and movements in stocks of mate- 
rials or finished goods. (3) The relationship between inventory holdings 
and sales is influenced by the general state of activity as well as by the 
level of activity in the industry in question. Stocks are less likely to 
rise pari passu with increases in activity when the economy is operating 
close to full employment than at lower levels. (4) The degree of corre- 
spondence depends not only upon the level from which a given move- 
ment in activity or change in stocks commences but also upon its rapid- 
ity. (Some of these relationships can be seen by a careful reading of 
Chart 2, which shows comparative monthly movements of production 
and stocks of an important group of foods during the period since 1935. 
The correspondence between the growth of output and of inventories 
of these goods which had characterized most of the period through 1940 
disappeared after that time, and the high levels of food production 
achieved during the war and subsequent to it were accomplished with- 
out a commensurate rise in stocks held at the manufacturing level.) 
Another matter requiring further study is the effect on the economy 
as a whole of various amounts of inventory change. In the imaginary 
static society earlier referred to, the levels of inventories were constant; 
there was never a so-called “net change in inventories”; or, rather the 
net change was always zero. In the actual world, although large net 
changes in inventories are ordinarily associated with periods of signifi- 
cant shifts in the general economic situation, it would be dangerous to 
treat a zero net change in inventories as necessarily signifying the exist- 
ence of stable conditions. The relationships between the general level 
of activity and the amount of inventory change need to be investigated 
in terms of gross changes as well as net. 


ON THE DETERMINATION OF SAMPLE SIZES 
IN DESIGNING EXPERIMENTS* 


Marilyn Harris,† D. G. Horvitz, and A. M. Mood 
Iowa State College 


Methods are developed for determining sample sizes re- 
quired to estimate means of normal populations with given 
precision, or to give significance when means differ by a speci- 
fied amount. The methods assume that an estimate of the vari- 
ance has been obtained from a previous or preliminary experi- 
ment. A simple device is suggested for using a priori informa- 
tion about the variance when a formal estimate is not avail- 
able. 


1. INTRODUCTION 


STATISTICIANS frequently encounter the problem of determining the 
sample size required to obtain a given result, for example, to esti- 
mate a mean by a confidence interval no longer than a specified amount, 
or to show a significant difference between two means if the difference 
is greater than a specified amount. The problem cannot be solved if 
nothing is known of the population distribution, but this is a rather 
unusual situation. Experimental workers generally know, for example, 
whether their populations are sensibly normal or not, and if not, what 
transformations will make them so. But this knowledge (of the form of 
the population distribution) is still not enough to make the problem 
determinate. It is necessary also to know something about the order of 
magnitude of the standard deviation. 

The experimental worker ordinarily has information of greater or 
less precision about the magnitude of the variance. He may have a 
rather precise estimate based on past experiments with the same popu- 
lation, or he may have only a guess. Usually his information is between 
these extremes, being based on past experiments with similar but not 
identical populations. We are concerned with finding a method for using 
information of this kind in planning an experimental investigation. 
However, we shall first consider the more special situation in which 
there exists an estimate of the variance with known precision. Then in 
the final section of the paper we shall return to the more common situa- 
tion in which the estimate of the variance is of dubious precision. The 
technique described there is admittedly crude, but it does provide a 
method for dealing with an important practical problem. 





* Most of this work was done under contract N7onr371 with the Office of Naval Research. 
† Now at Hunter College. 

In the following sections we shall deal with the problem of determining 
sample sizes when: 
1. The population is normal, and 
2. An estimate $s_1^2$ of the variance is available. 

Thus we assume at the outset that there is definite information about 
the variance in the form of an estimate obtained from a previous experi- 
ment with the same population. The methods to be presented are ap- 
propriate when an experiment is to be repeated once but there is no 
thought of successive repetition as is the case, for example, in quality 
control work. We are concerned here with the circumstance in which a 
research worker has turned up a result which, though not statistically 
significant, would be important if it were reproduced in a more defini- 
tive experiment. Thus a person trying various combinations of materi- 
als in an attempt to increase the life of a certain manufactured product 
may find one which gives an estimated mean life much greater than 
that of the standard product. The confidence interval for the new mean, 
however, is too broad to give positive evidence of the superiority of the 
new combination of materials, and it is desired to estimate the mean 
with greater precision in a second experiment, or to test whether the 
mean for the new combination exceeds that for the standard by a spec- 
ified amount in a second experiment. 


2. SAMPLE SIZE FOR A SPECIFIED CONFIDENCE INTERVAL 


Let us suppose that an experimenter wishes to estimate the mean of a 
normal population with a .99 confidence interval no longer than four 
units, and that he has an estimate, $s_1^2 = 9$, of the population variance 
with $m = 8$ degrees of freedom. Let us further suppose that he can be 
induced to state his problem in the following manner: 

    I have an estimate, nine, of the variance of a normal population 
    with eight degrees of freedom. I wish to draw a sample 
    sufficiently large that the probability will be .95 that the 
    .99 confidence interval will have length less than four units. 

It is impossible to guarantee that the confidence interval will be no 
longer than a specified amount, but one can guarantee it with 
probability .95 (or any other given probability). 

This problem is readily solved with existing tables. Let $n+1$ be the 
required sample size so that the second estimate $s_2^2$ of the population 
variance will have $n$ degrees of freedom. The general relation which 
determines $n$ (see appendix) is: 

\[ (d/s_1)^2 = t_{1-\alpha}^2(n)\, F_{1-\beta}(n, m)/(n + 1) \tag{1} \]

where the symbols have these meanings: $\alpha$ is the confidence coefficient 
of the confidence interval, $\beta$ is the probability that the specified length 
will not be exceeded, $d$ is half the specified maximum length of the 
interval, $t_{1-\alpha}(n)$ is the $1-\alpha$ critical level of Student's $t$ for $n$ degrees 
of freedom, and $F_{1-\beta}(n, m)$ is the $1-\beta$ critical level of the variance 
ratio for $n$ and $m$ degrees of freedom. 

The relation cannot be solved explicitly for $n$. Probably as good a 
method as any for finding $n$ is to determine it by trial and error. One 
might, for example, compute a value $V$ for the right side of (1) using a 
trial value of $n$. If $V$ turns out to be larger than $(d/s_1)^2$, a larger value of 
$n$ would be used in the next trial; otherwise a smaller value of $n$ would 
be tried. In the specific example above where $s_1 = 3$, $m = 8$, $\alpha = .99$, 
$\beta = .95$, $d = 2$, we may try $n = 40$, for example, to get 

\[ V = (7.31)(3.05)/41 = .54, \]

which is larger than $(d/s_1)^2 = .444$; hence a larger value of $n$ is needed. 
After a few trials $n$ is found to be 48 so that the required sample size 
is 49. 
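
The trial-and-error procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' computation: the helper names `rhs_of_eq1` and `smallest_n` are hypothetical, and the critical values of $t^2$ and $F$ for each trial $n$ are assumed to be read from tables (or supplied by any statistics library). Only the trial $n = 40$ below uses numbers quoted in the text.

```python
from typing import Callable


def rhs_of_eq1(t_squared: float, f_crit: float, n: int) -> float:
    """Right side of relation (1): V = t^2 * F / (n + 1)."""
    return t_squared * f_crit / (n + 1)


def smallest_n(target: float, v_of_n: Callable[[int], float],
               n_max: int = 10_000) -> int:
    """Smallest n >= 1 for which V(n) no longer exceeds the target (d/s_1)^2.

    v_of_n is a caller-supplied function returning V for a trial n; it is
    assumed to look up the critical values of t and F for that n.
    """
    for n in range(1, n_max + 1):
        if v_of_n(n) <= target:
            return n
    raise ValueError("no n up to n_max satisfies the relation")


# Trial n = 40 with the critical values quoted in the text.
target = (2 / 3) ** 2                 # (d / s_1)^2 with d = 2, s_1 = 3
v40 = rhs_of_eq1(7.31, 3.05, 40)      # t^2 = 7.31, F = 3.05
print(round(target, 3), round(v40, 2), v40 > target)
```

Since `v40` exceeds the target, the search must continue with larger $n$, as in the text.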

Critical values of the variance ratio for $\beta = .95$ and .99 are tabulated 
in almost every statistics book; for other values of $\beta$ one may consult 
references 1 and 2 at the end of the paper. The value of $t^2$ in (1) is 
conveniently tabulated in the $F$ tables since $F_{1-\alpha}(1, n) = t_{1-\alpha}^2(n)$. 

We may observe that the result, $n+1 = 49$, is considerably larger 
than what one would have obtained by employing this commonly 
used argument: The population standard deviation is about 3, hence 
the mean of a sample of size $n+1$ will have a standard error of 
$3/\sqrt{n+1}$. For a 99% confidence interval $t_{1-\alpha}$ will be 2.8 or 2.9, hence the 
half confidence interval will have a length of about $(2.9)(3)/\sqrt{n+1}$. 
On putting this equal to two, one finds $n+1 = 19$. This argument 
essentially puts $F = 1$ in equation (1) and corresponds roughly to putting 
$\beta = 0.5$; that is, with $n+1 = 19$ one has about a 50-50 chance that the 
resulting confidence interval will have a length smaller than four. 
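
The arithmetic of this commonly used argument may be checked directly. The short Python sketch below simply solves $(2.9)(3)/\sqrt{n+1} = 2$ for $n+1$ and rounds up; the variable names are illustrative only.

```python
import math

s1 = 3.0        # guessed population standard deviation
d = 2.0         # desired half-length of the confidence interval
t_crit = 2.9    # rough critical level of t used in the argument

# Solve t_crit * s1 / sqrt(n + 1) = d for the sample size n + 1.
exact_size = (t_crit * s1 / d) ** 2
sample_size = math.ceil(exact_size)
print(exact_size, sample_size)
```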


3. TESTS OF SIGNIFICANCE OF THE MEAN 


The problem of testing the significance of a mean is closely related 
to the problem of confidence interval estimation of the mean. Let us 
suppose the particular problem considered in section two is changed 
to read as follows: 

    I have an estimate, $s_1^2 = 9$, of the population variance with 
    $m = 8$ degrees of freedom. I wish to draw a sample large 
    enough that the probability will be $\beta = .80$ that I will get 
    significance at the $\gamma = .05$ level in testing whether the mean 
    exceeds 10 if it actually exceeds 10 by as much as $a = 1.5$. 


TABLE I 

VALUES OF $k = a/s_1$ FOR $\beta = .80$ 
$\gamma = .05$ FOR ONE-TAILED TESTS, .10 FOR TWO-TAILED TESTS 

n \ m      1      2      3      4      5      6      8     12     16     24     32      ∞
    1   13.8   8.52   7.39   6.93   6.68   6.51   6.31   6.13   6.04   5.96   5.92   5.79
    2   5.88   3.51   3.02   2.81   2.70   2.62   2.53   2.45   2.41   2.37   2.35   2.30
    3   4.30   2.55   2.20   2.03   1.96   1.91   1.85   1.78   1.75   1.72   1.70   1.65
    4   3.55   2.10   1.80   1.67   1.60   1.56   1.50   1.45   1.43   1.40   1.39   1.36
    5   3.12   1.85   1.58   1.47   1.41   1.37   1.32   1.28   1.25   1.23   1.22   1.18
    6   2.81   1.66   1.43   1.32   1.27   1.23   1.19   1.15   1.13   1.11   1.10   1.07
    7   2.56   1.52   1.30   1.21   1.16   1.12   1.08   1.05   1.03   1.02   1.01   .979
    8   2.37   1.41   1.21   1.12   1.07   1.04   1.00   .972   .956   .940   .932   .910
    9   2.23   1.32   1.14   1.05   1.01   .978   .944   .913   .898   .883   .875   .854
   10   2.11   1.25   1.07   .993   .952   .925   .893   .863   .849   .835   .828   .805
   12   1.92   1.14   .975   .902   .865   .840   .811   .784   .771   .758   .752   .732
   14   1.77   1.05   .899   .831   .797   .775   .748   .723   .710   .699   .693   .676
   16   1.65   .976   .838   .775   .743   .722   .697   .673    [?]    [?]    [?]   .631
   18   1.56   .921   .790   .731   .701   .681   .658   .635   .624   .614   .609   .594
   20   1.48   .873   .750   .693   .665   .646   .624   .602   .592   .583   .578   .563
   25   1.32   .779   .669   .619   .593   .577   .557   .538   .529   .520   .515   .502
   30   1.20   .708   .608   .563   .540   .525   .507   .489   .481   .473   .469   .456
   40   1.04   .613   .526   .486   .467   .454   .438   .423   .416   .409   .405   .395
   50   .925   .548   .471   .435   .417   .405   .391   .378   .371   .365   .362   .353
   60   .844   .499   .429   .396   .380   .369   .356   .344   .338   .333   .330   .322
   80   .730   .432   .371   .342   .328   .319   .308   .298   .292   .288   .285   .278
  100   .652   .385   .331   .306   .293   .285   .275   .266   .261   .257   .255   .249

The size of the specified mean is not relevant to the problem; the 
critical figure is $a = 1.5$. In order that the sample size be determined it is 
necessary that the experimenter come to a decision about the 
magnitudes of $\beta$, $\gamma$, and $a$. 

Some new tables are required to solve this problem, one for each 
combination of values of $\gamma$ and $\beta$. Two are presented here: one for $\gamma = .05$, 
$\beta = .80$, and one for $\gamma = .05$, $\beta = .95$. The theoretical basis of the tables 
is given in the appendix. The number of degrees of freedom $n$ in the 
second estimate $s_2^2$ of the variance is found by computing the number 

\[ k = a/s_1 \tag{2} \]

then locating that value in the column headed by $m$; the number 
opposite $k$ in the column on the left is the required value of $n$. In the 
example given above, $k = 1.5/3 = 0.5$. This value is found in the column 
headed by 8 in Table I to correspond to $n = 31$, so that the sample 
size would be 32. 
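
The table look-up for this example can be reproduced in a short Python sketch. The listing below copies a few entries from the $m = 8$ column of Table I and interpolates linearly on $1/\sqrt{n}$ between neighbouring rows; the function name `n_from_k` is hypothetical, and rounding the interpolated value up to the next integer is an assumption made here for safety.

```python
import math

# A few (n, k) entries from the m = 8 column of Table I (beta = .80).
TABLE_I_M8 = [(20, .624), (25, .557), (30, .507), (40, .438), (50, .391)]


def n_from_k(k, column):
    """Find n for a given k by interpolating linearly on 1/sqrt(n)."""
    for (n_lo, k_lo), (n_hi, k_hi) in zip(column, column[1:]):
        if k_hi <= k <= k_lo:                  # k is bracketed by two rows
            x_lo = 1 / math.sqrt(n_lo)
            x_hi = 1 / math.sqrt(n_hi)
            frac = (k - k_hi) / (k_lo - k_hi)
            x = x_hi + frac * (x_lo - x_hi)    # interpolated 1/sqrt(n)
            # round() guards against floating-point noise before ceil()
            return math.ceil(round(1 / x ** 2, 6))
    raise ValueError("k lies outside the tabulated range")


k = 1.5 / 3                       # k = a / s_1 for the example
n = n_from_k(k, TABLE_I_M8)
print(n, n + 1)                   # degrees of freedom and sample size
```

For $k = 0.5$ the interpolated value falls between the rows for $n = 30$ and $n = 40$ and rounds up to $n = 31$, in agreement with the text.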

Interpolation in these tables is nearly linear on the reciprocal of $m$ 
and is also nearly linear on the reciprocal of the square root of $n$. 


TABLE II 

VALUES OF $k = a/s_1$ FOR $\beta = .95$ 
$\gamma = .05$ FOR ONE-TAILED TESTS, .10 FOR TWO-TAILED TESTS 

n \ m      1      2      3      4      5      6      8     12     16     24     32      ∞
    1   57.1   19.5   14.4   12.6   11.6   11.0   10.4   9.85   9.58   9.33   9.21   8.86
    2   24.2   7.74   5.60   4.77   4.39   4.15   3.86   3.61   3.49   3.38   3.33   3.19
    3   17.6   5.58   4.03   3.39   3.13   2.94   2.74   2.55   2.46   2.39   2.35   2.23
    4   14.5   4.58   3.28   2.79   2.56   2.40   2.23   2.08   2.01   1.94   1.91   1.82
    5   12.6   3.97   2.88   2.41   2.23   2.09   1.93   1.82   1.76   1.69   1.66   1.58
    6   11.2   3.55   2.57   2.17   2.00   1.88   1.73   1.62   1.57   1.52   1.49   1.42
    7   10.3   3.26   2.36   1.99   1.83   1.72   1.58   1.48   1.43   1.38    [?]    [?]
    8   9.70   3.05   2.19   1.86   1.70   1.60   1.48   1.39   1.34   1.29    [?]    [?]
    9   9.12   2.87   2.06   1.75   1.60   1.50   1.39   1.30   1.26   1.21    [?]    [?]
   10   8.62   2.72   1.95   1.65   1.51   1.42   1.32   1.23   1.19   1.15    [?]    [?]
   12   7.83   2.47   1.77   1.50   1.37   1.29   1.20   1.12   1.08   1.04    [?]    [?]
   14   7.22   2.28    [?]   1.38   1.26   1.19   1.11   1.03   .993   .959   .942   .893
   16   6.73   2.13    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]   .893
   18   6.35   2.01    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]   .842
   20   6.02   1.90    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]   .798
   25   5.37   1.70    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]   .712
   30   4.89   1.54    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]    [?]
   50   3.78   1.19   .854   .722   .661   .622   .577   .537   .518   .500   .492   .469
   60   3.45   1.09   .778   .658   .602   .567   .525   .490   .472    [?]    [?]   .428
   80   2.98   .940   .672   .569   .520   .490   .454   .423   .408   .395   .388   .369
  100   2.67   .840   .600   .508   .465   .438   .405   .378   .365   .353   .347   .329

When the computed value of $k$ is smaller than the final entry in the 
column of the table, the sample size will be larger than 100 and may be 
computed as follows: 

\[ n = 100(K/k)^2, \tag{3} \]

where $K$ is the tabular value for $n = 100$ in the column headed by $m$. 
In the illustrative example, if $a$ were specified to be 0.6 instead of 1.5, 
the value of $k$ would be 0.2 and $n$ would be found to be 

\[ n = 100(.275/0.2)^2 = 189. \]
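
Relation (3) is a one-line computation. The Python sketch below evaluates it for the illustrative example; the function name `n_beyond_table` is hypothetical, and rounding to the nearest integer is an assumption chosen to mirror the figure quoted in the text.

```python
def n_beyond_table(k: float, K_100: float) -> int:
    """Relation (3): n = 100 (K / k)^2, where K is the tabular entry for
    n = 100 in the column headed by m. Rounded to the nearest integer."""
    return round(100 * (K_100 / k) ** 2)


k = 0.6 / 3                        # a = 0.6, s_1 = 3
n = n_beyond_table(k, K_100=.275)  # K from Table I, column m = 8
print(n)
```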


The example we have considered was stated so that a one-sided test 
of significance was implied. The experimenter was interested only in 
whether the mean exceeded the given figure, ten; he had no interest in 
a smaller mean however small it might have been. He would compute 
Student's $t$ from his second sample and if it were greater than 
$t_{.10}(31) = 1.695$ he would conclude at the 5% level of significance that the 
mean exceeded ten. We have used the subscript .10 because the tables 
for critical levels of $t$ ordinarily include both tails of the distribution in 





396 AMERICAN STATISTICAL ASSOCIATION JOURNAL SEPTEMBER 1948 
the critical region. The sample size 32 gives him 80% assurance that 

\[ t = \sqrt{32}\,(\bar{x} - 10)/s_2 \tag{4} \]

will exceed 1.695 if in fact the population mean is 11.5 ($\bar{x}$ is the sample 
mean and $s_2$ the sample standard deviation). 

If one wishes to make two-tailed tests, then Tables I and II may be 
used provided the level of significance is doubled. Thus, to test whether 
the mean differs significantly in either direction from 10 in the il- 
lustrative example, one could again use the sample size 32 but the test 
would be at the 10% level of significance. 

Now let us consider the possible errors that may be made in the 
conclusions in relation to $\beta$ and $\gamma$ which are .80 and .10 respectively in the 
problem at hand. If the population mean is ten, then the sample may 
give rise to a value of $t$ larger than 1.695 with probability .05. The total 
probability of obtaining significance in either direction (the type I 
error of Neyman and Pearson) is .10. If the population mean is 8.5, the 
sample may give rise to a value of $t$ which is not smaller than $-1.695$, 
and one will fail to detect at the .10 level of significance that the mean 
is smaller than ten; the probability of this error (the type II error of 
Neyman and Pearson) is $1 - \beta = .20$ in the present instance. A similar 
situation occurs when the mean is actually 11.5; the probability is .20 
that the sample will fail to show the mean to be significantly larger 
than ten at the .10 level of significance. When a type II error is made it 
will usually consist of the statement that the mean does not differ 
from ten. However there is a remote chance that an extreme error will 
be made: that the population mean will be declared smaller than ten 
when in fact it is 11.5, or declared larger than ten when it is actually 
8.5. The probability of such an extreme error is about .0006 for Table 
I and about .0002 for Table II. Thus in the given problem, if the true 
mean is 11.5 and one uses a sample of size 32 to test whether the mean 
differs from ten at the .10 level, the probability is about .0006 that the 
sample will give rise to a value of $t$ which is smaller than $-1.695$, and 
thus lead one to the conclusion that the population mean is less than 
ten. 

4. DIFFERENCES BETWEEN TWO MEANS 


If one wishes to estimate the difference $b$ between two population 
means knowing that the populations are normal and have the same 
variance, he may draw a sample of size $n+1$ from each population and 
put 

\[ t = \frac{(\bar{x} - \bar{x}' - b)\sqrt{n+1}}{\sqrt{s_2^2 + s_2'^{2}}} \tag{5} \]

equal to $\pm t_{1-\alpha}(2n)$ and solve for $b$ to obtain the limits of the $\alpha$ 
confidence interval. ($\bar{x}$ and $\bar{x}'$ are the two sample means, and $s_2^2$ and $s_2'^{2}$ 
are the two sample variances.) Suppose one wants $n+1$ to be large 
enough that the probability will be $\beta$ that the confidence interval will be 
shorter than $2d$, and that an estimate $s_1^2$ with $m$ degrees of freedom has 
been previously obtained for one of the populations. The equation 
which determines $n$ is 

\[ d^2/2s_1^2 = t_{1-\alpha}^2(2n)\, F_{1-\beta}(2n, m)/(n + 1) \tag{6} \]

as follows directly from equation (1). The divisor $2s_1^2$ arises from the 
fact that the variance of the difference between two observations is 
$2\sigma^2$ if each observation has variance $\sigma^2$. Equation (6) may be solved by 
trial and error for $n$. 

In testing whether one mean is larger than another by an amount $a$ 
one proceeds exactly as in the preceding section except that the 
quantity $k$ to be found in the table is defined by 

\[ k = a/2s_1. \]

The number $n$ thus determined is the number of degrees of freedom in 
the pooled estimate of the variance; the size of the sample to be drawn 
from each population is therefore $(n+2)/2$ if $n$ is even or $(n+3)/2$ if 
$n$ is odd. This sample size will insure that a difference in means of size 
$a$ will be detected at the .05 level with probability .80 or .95 according 
to whether Table I or Table II is used to determine $n$. Here again it is 
assumed that the two populations are normal and have the same 
variance. 
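
The two-sample rule may be sketched as follows. The function names are hypothetical, and the numerical case is an assumption chosen for illustration: with $a = 3$ and $s_1 = 3$ one has $k = 0.5$, for which the preceding section found $n = 31$ in the column $m = 8$ of Table I.

```python
def k_two_means(a: float, s1: float) -> float:
    """Quantity to look up in Table I or II when comparing two means."""
    return a / (2 * s1)


def per_sample_size(n: int) -> int:
    """Sample size drawn from each population when the pooled variance
    estimate is to have n degrees of freedom: (n+2)/2 for even n,
    (n+3)/2 for odd n."""
    return (n + 2) // 2 if n % 2 == 0 else (n + 3) // 2


k = k_two_means(a=3.0, s1=3.0)   # gives 0.5
n = 31                           # read from Table I, column m = 8, for k = 0.5
print(k, per_sample_size(n))
```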


5. REPLICATION OF EXPERIMENTAL DESIGNS 


Tables I and II may be used to aid one in making a decision about the 
number of replications to be used in a designed experiment. Suppose 
$p$ normal populations with means $c_1, c_2, \ldots, c_p$ and a common 
variance are to be sampled; the means may represent treatment effects, for 
example, in an agricultural or industrial experiment. The most 
satisfactory way to deal with this problem is to have the experimenter specify 
what size treatment sum of squares he wishes to detect; that is, specify 
the size of 

\[ \sum_{i=1}^{p} (c_i - \bar{c})^2. \]

However, the solution of the problem in this form requires triple entry 
tables (the three variables being the numbers of degrees of freedom in 
the preliminary estimate of the error variance, the error variance of 
the experiment, and the treatment sum of squares). Since we had not 
the facilities to compute such tables, we shall have to consider the 
problem merely in terms of the differences between pairs of means. 

Let us suppose that a number of treatments or treatment 
combinations are to be studied in an experimental design with $r$ replications, 
that the number of degrees of freedom in the error variance $s_2^2$ for the 
proposed experiment is $n$, that any treatment mean will be based on $q$ 
observations, and that an estimate $s_1^2$ with $m$ degrees of freedom of the 
error variance is available. Two treatments with means $\bar{x}_i$ and $\bar{x}_j$ will 
be compared by computing 

\[ t = \frac{\sqrt{q}\,(\bar{x}_i - \bar{x}_j)}{\sqrt{2}\, s_2} \]

and comparing it with a critical level of $t$ with $n$ degrees of freedom. The 
numbers $n$ and $q$ will be functions of $r$. 

The size of a difference between treatment means that has an 80% 
chance of being detected at the .05 level (in a given direction) may be 
computed for any proposed value of $r$ by means of the relation, 

\[ a = \sqrt{\frac{2(n+1)}{q}}\; k s_1 \tag{7} \]

where $k$ is the entry in Table I corresponding to $m$ and $n$. Similarly, 
using Table II to determine $k$, one may find the difference which has a 
95% chance of being detected at the .05 level. 

As an illustration, suppose six treatments are to be tested in a 
randomized block experiment with $r$ replications and that we have an 
estimate, $s_1^2 = 9$, of the error variance with $m = 8$ degrees of freedom. We 
then have $q = r$ and $n = 5(r-1)$. For $r = 2$, for example, a difference in 
treatment means of 

\[ a = \sqrt{6}\,(1.32)3 = 9.7 \]

has an 80% chance of being detected in a particular comparison. With 
five replications, $a$ is reduced to 

\[ a = \sqrt{\frac{42}{5}}\,(.624)3 = 5.4. \]

A few trial calculations of this nature for feasible values of $r$ will provide 
one with an objective basis for coming to a decision about the number 
of replications to be used. 
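
The trial calculations for the randomized block illustration can be reproduced with a short Python sketch. It assumes that relation (7) has the form $a = \sqrt{2(n+1)/q}\; k s_1$, which agrees with both worked figures in the text; the function name `detectable_difference` is hypothetical, and the two values of $k$ are the Table I entries for $m = 8$ at $n = 5$ and $n = 20$.

```python
import math


def detectable_difference(k: float, s1: float, n: int, q: int) -> float:
    """Difference a between two treatment means detectable with the
    probability built into the table from which k was read."""
    return math.sqrt(2 * (n + 1) / q) * k * s1


s1 = 3.0                     # square root of the prior estimate s_1^2 = 9
treatments = 6
for r, k in [(2, 1.32), (5, .624)]:
    n = (treatments - 1) * (r - 1)   # error degrees of freedom, n = 5(r - 1)
    q = r                            # observations per treatment mean
    a = detectable_difference(k, s1, n, q)
    print(r, n, round(a, 1))
```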


6. USE OF GUESSED ESTIMATES OF THE VARIANCE 


It often happens that while an experimenter has no specific 
information about the variance of a distribution in the form of an estimate 
$s_1^2$, he nevertheless has some more or less reliable opinion of its order of 
magnitude. He may, for example, have done considerable 
experimentation with several similar populations and thus have knowledge of the 
general range of variation in such populations. When an experimenter 
has information of this kind he would naturally like to use it, and of 
course he should do so; it will improve the efficiency of his 
experiments. 

The question arises as to how one may quantify information of a 
general character; more precisely, how may one convert the 
information into numerical quantities which may be used for $m$ and $s_1^2$? One 
simple method is for the experimenter to place what he feels to be 
reasonable upper and lower limits on the standard deviation. We may 
call these numbers $G$ and $H$, and let us suppose that the experimenter 
has selected them in an effort to make it a fair nine-to-one bet that they 
will include the population standard deviation. In other words, $G$ and 
$H$ are to be regarded as 90% confidence limits for the standard 
deviation. One may then use the average of $G$ and $H$ for $s_1$: 

\[ s_1 = \tfrac{1}{2}(G + H), \]

and determine the degrees of freedom, $m$, to be associated with $s_1$ by 
computing $G/H$ and consulting Table III. The second line of this table 
consists merely of the square roots of the ratios of the .95 and .05 levels 
of chi-square. 


TABLE III 

DEGREES OF FREEDOM ASSOCIATED WITH VARIOUS RATIOS 
OF LIMITS ON THE STANDARD DEVIATION 

m        1     2     3     4     5     6     7     8     9    10
G/H   31.3   7.6   4.7   3.7   3.1   2.8   2.6   2.4   2.3   2.2
g/h    3.6   2.2  1.84  1.67  1.58  1.51  1.46  1.42  1.39  1.36


If one has no firm feeling that his upper and lower limits really 
correspond to nine-to-one odds, he may prefer to select upper and lower 
limits, $g$ and $h$, in such a way that he would have no choice between 
sides of an even-money bet that they include the standard deviation. 
Again the average of $g$ and $h$ would be employed as the estimate 
$s_1$, and $m$ would be determined by the third line of Table III which 
was computed using the .75 and .25 levels of chi-square. This method 
is, in fact, to be preferred over the first method because one may 
ordinarily be expected to have a more accurate conception of a fair 
even-money bet than a fair nine-to-one bet. 

As an example, let us suppose 7 and 12 have been selected as 50% 
limits on a standard deviation. The estimate $s_1$ is 9.5, and since the 
ratio $g/h$ is 1.7, there are about 4 degrees of freedom associated with 
the estimate. If one wishes to be conservative he may prefer to call it 
3 degrees of freedom, but ordinarily the nearest figure may as well 
be selected. Only on rare occasions would there be any point in 
interpolating between 3 and 4, though of course there is no reason why 
a fractional value of $m$ could not be used. 
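
The conversion of guessed limits into an estimate and its degrees of freedom can be sketched in Python. The listing copies Table III and selects the nearest tabulated ratio; the function name `guessed_estimate` and the nearest-entry rule are assumptions made for illustration (the text notes that a more conservative or a fractional value of $m$ might equally be chosen).

```python
# Table III: m -> (G/H for 90% limits, g/h for 50% limits)
TABLE_III = {
    1: (31.3, 3.6), 2: (7.6, 2.2), 3: (4.7, 1.84), 4: (3.7, 1.67),
    5: (3.1, 1.58), 6: (2.8, 1.51), 7: (2.6, 1.46), 8: (2.4, 1.42),
    9: (2.3, 1.39), 10: (2.2, 1.36),
}


def guessed_estimate(lower: float, upper: float, even_money: bool = True):
    """Return (s_1, m) from guessed limits on the standard deviation.

    even_money=True treats the limits as 50% limits (third line of
    Table III); False treats them as 90% limits (second line).
    """
    s1 = (lower + upper) / 2
    ratio = upper / lower
    col = 1 if even_money else 0
    # Degrees of freedom whose tabulated ratio lies nearest the guessed ratio.
    m = min(TABLE_III, key=lambda df: abs(TABLE_III[df][col] - ratio))
    return s1, m


print(guessed_estimate(7, 12))   # the example in the text
```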


APPENDIX 


A sample of size $n+1$ is to be drawn so that the probability will be 
$\beta$ that an $\alpha$ confidence interval for the mean will be no longer than 
$2d$; an estimate $s_1^2$ of the variance with $m$ degrees of freedom is 
available. If $s_2^2$ is the sample variance of the new sample, the half-length of 
the confidence interval will be $t_{1-\alpha}(n)\, s_2/\sqrt{n+1}$; it is desired that $n$ 
be such that 

\[ P\left[t_{1-\alpha}(n)\, s_2/\sqrt{n+1} < d\right] = \beta \tag{8} \]

or 

\[ P\left[s_2 < \frac{d\sqrt{n+1}}{t_{1-\alpha}(n)}\right] = \beta. \tag{9} \]

For any $n$, it is known that 

\[ P\left(s_2 < \sqrt{F_{1-\beta}(n, m)}\; s_1\right) = \beta \tag{10} \]

where $F_{1-\beta}(n, m)$ is the critical level of the variance ratio for $n$ and $m$ 
degrees of freedom. On equating the right-hand sides of the inequalities 
in (9) and (10) one obtains the relation given in equation (1) for 
determining $n$. 
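
The step from (9) and (10) to relation (1) may be written out explicitly. The following lines are a worked restatement in LaTeX, assuming the reading of (9) and (10) in which the bounds on $s_2$ are $d\sqrt{n+1}/t_{1-\alpha}(n)$ and $\sqrt{F_{1-\beta}(n,m)}\,s_1$ respectively:

```latex
% Equate the two bounds on s_2 that each hold with probability \beta,
% then square and rearrange.
\begin{align*}
\frac{d\sqrt{n+1}}{t_{1-\alpha}(n)} &= \sqrt{F_{1-\beta}(n,m)}\; s_1 \\
\frac{d^2\,(n+1)}{t_{1-\alpha}^2(n)} &= F_{1-\beta}(n,m)\; s_1^2 \\
\left(\frac{d}{s_1}\right)^2 &= \frac{t_{1-\alpha}^2(n)\, F_{1-\beta}(n,m)}{n+1}
\end{align*}
```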

We now turn to the derivation of Tables I and II. Suppose $\bar{x}$ is the 
sample mean and that the null hypothesis $H_0: \mu = 0$ is to be tested against 
alternatives $\mu > 0$ at the $\gamma$ level of significance. $H_0$ would be rejected if 

\[ \sqrt{n+1}\;\bar{x}/s_2 > t_{2\gamma}(n) \tag{11} \]

where we have used the subscript $2\gamma$ in accordance with the usual 
notation of $t$ tables. If the mean is actually $a > 0$ it is desired that (11) 
occur with probability $\beta$; we want 

\[ P\left[\sqrt{n+1}\;\bar{x}/s_2 > t_{2\gamma}(n)\right] = \beta \]

which may be written 

\[ P\left[t > t_{2\gamma}(n) - \sqrt{n+1}\; a/s_2\right] = \beta \tag{12} \]

where $t = \sqrt{n+1}\,(\bar{x} - a)/s_2$ has the Student distribution with $n$ degrees 
of freedom. In (12) we shall put $D = \sqrt{n+1}\; a/s_1$ and $F = s_1^2/s_2^2$ to 
obtain 

\[ P\left[t > t_{2\gamma}(n) - D\sqrt{F}\right] = \beta. \tag{13} \]

We shall use this relation to determine a value $D_0$ for given values of 
$m$ and $n$; then $D_0/\sqrt{n+1}$ corresponds to a particular value of $a/s_1$ 
given in advance, and $n+1$ will be the required sample size for that 
value of $a/s_1$. In determining $D_0$ we assume $t$ and $F$ have a distribution 
determined by the distributions of $\bar{x}$, $s_1$, $s_2$. This is legitimate for only 
one use of $s_1$; clearly for repeated use of $s_1$ to determine sample sizes 
one would have to evaluate $D$ from (13) using the conditional 
distribution of $F$ given $s_1$, and that conditional distribution involves the 
unknown variance. We thus in effect treat the problem as if the question 
of determining the sample size arose in advance of the information in $s_1$. 
(An excellent discussion of this general problem has been given by 
Stein in reference 4.) 

The variates $t$ and $F$ in (13) are not independent since they are both 
functions of $s_2$; however their joint distribution is readily found as 
follows: Since $\bar{x}$, $s_1$, and $s_2$ are mutually independent their joint density 
function is the product of their individual densities. On transforming 
the three variates to $t$, $F$, and $s_2$, and then integrating out $s_2$ one finds 
the density of $t$ and $F$ to be 

\[ f(t, F) = C F^{(m-2)/2}/(t^2 + mF + n)^{(m+n+1)/2} \tag{14} \]

$C$ being a function of $m$ and $n$. 
To facilitate the numerical integration needed to evaluate the 
integral implied by (13), further transformations were made. If one lets 

\[ x^2 = (m + n)t^2/(mF + n) \]

\[ y = mF/(mF + n) \]

the density function reduces to 

\[ f(x, y) = \frac{\Gamma\left(\frac{m+n+1}{2}\right)}{\sqrt{\pi(m+n)}\;\Gamma\left(\frac{m+n}{2}\right)} \left(1 + \frac{x^2}{m+n}\right)^{-(m+n+1)/2} \cdot \frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)}\; y^{(m-2)/2}(1-y)^{(n-2)/2} \]

so that $x$ and $y$ are independently distributed and by tabulated 
functions, $x$ having the Student distribution and $y$ the beta distribution. 
Nevertheless one numerical integration is required to evaluate the 
probability in (13), and since we fixed $\beta$ in advance it was necessary 
to do several such integrations for trial values of $D$ to determine each 
entry in Tables I and II. $D/\sqrt{n+1}$ rather than $D$ is tabulated. The 
entries corresponding to m=1, 2, 4, 8, 16, 32 and n=1, 2, 4, 16, 32, 64, 
128 were determined by integration; the other entries were filled in by 
interpolation. The level of accuracy of the tabular values is about one 
in 300; an entry such as 9.12 may be in error by three units in the last 
place. The entries in the right hand column of each table were taken 
from the Neyman and Tokarska (reference 3) table of the power of the 
$t$ test. 


APPLICATION OF THE THEORY OF EXTREME 
VALUES IN FRACTURE PROBLEMS* 


Benjamin Epstein 
Coal Research Laboratory, Carnegie Institute of Technology 


In this paper it is shown that the theory of extreme values 
is pertinent to the treatment of certain aspects of the fracture 
or break-down of materials used in modern technology. An at- 
tempt is made to integrate some of the results scattered 
through the technical literature. 


INTRODUCTION 


DURING THE PAST twenty-five years there has been an increasing 
interest in gaining a deeper insight into the laws governing some of 
the phenomena observed in connection with the fracturing of metals, 
textiles, glass and other materials used in modern technology. Among 
other things, one would like to account for such things as the skewness 
of the distribution of breaking strengths of supposedly identical 
samples and the dependence of strength on specimen size. 

It appears, from a survey of the literature on this subject, that work- 
ers in many different fields have obtained special results without realiz- 
ing fully the general setting of the problem. It is our purpose in this 
paper to indicate how the application of the theory of extreme values 
leads to a resolution of some of the pertinent questions. 


SURVEY OF FIELD 


In essence, all the statistical models proposed in the study of 
fracture take as a starting point Griffith's theory¹ which states that the 
difference between the calculated strengths of materials and those 
actually observed resides in the fact that there exist flaws in the body 
which weaken it. Accepting this point of view is equivalent to assuming 
that there will be a distribution of strengths in a given specimen in the 
sense that a different amount of force will be needed to fracture a 
specimen at one or another point. There are many physical situations where 
failure at one point means failure of the entire specimen. But this simply 
means statistically that it is the worst flaw among $N$ flaws (where $N$ 
is the number of flaws in the specimen) which determines the strength 
of a specimen and therefore the theory of extreme values is immediately 





* Presented at the 107th Annual Meeting of the American Statistical Association, December 30,
1947, in connection with the program on “The Theory of Extreme Values and Its Applications.”

1 A. A. Griffith, “The Phenomena of Rupture and Flow in Solids,” Philosophical Transactions of the
Royal Society, 221A, 163 (1920); “The Theory of Rupture,” Proceedings of the First International Con- 
gress for Applied Mechanics, 1, Delft, p. 55 (1924). 
applicable. Clearly N increases as the specimen size increases and,
therefore, the dependence of strength on specimen size is equivalent 
statistically to the problem of how the distribution of smallest (or 
largest) values depends on N. Generally speaking, asymptotic theory 
is applicable since N, the number of flaws, is large in most practical 
problems. 

As far as is known to the author, the first worker to realize the con- 
nection between specimen strength and distribution of extreme values 
was F. T. Peirce² of the British Cotton Industry Association. The application
of essentially the same ideas to the study of the strength of
materials was carried out by the Swedish physicist and engineer,
Weibull.³ Unlike Peirce, who assumed a Gaussian distribution of
strengths in the locality of flaws, he made the assumption that \( F(S) \),
the probability of breakage as a function of the stress S, is given by
\( F(S) = 1 - e^{-(S/S_0)^m} \), where \( S_0 \) and m are unknown parameters depending
on the characteristics of the material under test. Experimental data
drawn from the fields of textiles, metallurgy, dielectrics, and metal
fatigue indicate that Weibull’s assumptions are reasonable. Recently
his views were partially corroborated by Davidenkow, Shevandin and
Wittman.⁴
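
As a rough numerical illustration of Weibull's assumption (a sketch only, not Weibull's own computation), the following Python fragment evaluates a breakage probability of the form 1 - exp[-(S/S0)^m]. It assumes in addition that a specimen of volume V behaves as V independent unit volumes in series, which is the weakest-link picture described above. The function names and the parameter values are hypothetical.

```python
import math


def weibull_failure_prob(stress, s0, m, volume=1.0):
    """Probability that a specimen of the given volume breaks at or below `stress`.

    Assumption: unit volumes fail independently, so the survival
    probabilities multiply and the exponent is scaled by `volume`.
    """
    return 1.0 - math.exp(-volume * (stress / s0) ** m)


def weibull_median_strength(s0, m, volume=1.0):
    """Stress at which half of the specimens of this volume have failed."""
    return s0 * (math.log(2.0) / volume) ** (1.0 / m)


# Hypothetical material constants, chosen only for illustration.
s0, m = 100.0, 5.0
for volume in (1.0, 10.0, 100.0):
    median = weibull_median_strength(s0, m, volume)
    print(volume, round(median, 2), round(weibull_failure_prob(median, s0, m, volume), 3))
```

Under these assumptions the median strength falls off as the -1/m power of the volume, so a larger exponent m means a weaker size effect.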

The Russian physicists, Frenkel and Kontorova,⁵ were the next to
study these problems. Starting with a distribution of flaw strengths
x given by the probability density function,

    \[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}, \]

they found that the most probable value of the strength of specimens of
volume V large enough to contain many flaws is approximately given
by

(1)    \[ \mu - \sigma\sqrt{2 \log nV - 2 \log 2\sqrt{\pi}}, \]

where n is the average number of flaws per cubic centimeter of the
material. This result states that the strength of specimens under the
Frenkel-Kontorova assumptions should decrease linearly with \( \sqrt{\log V} \).





2 F. T. Peirce, “Tensile Tests for Cotton Yarns V. ‘The Weakest Link’—Theorems on the Strength 
of Long and of Composite Specimens,” Journal of the Textile Institute, Transactions, 17, 355 (1926). 

3 W. Weibull, “A Statistical Theory of the Strength of Materials,” Ing. Vetenskaps Akad. Handl., 
No. 151 (1939); “The Phenomenon of Rupture in Solids,” Ibid., No. 153 (1939). See also John Tucker, 
“Statistical Theory of the Effect of the Dimensions and Method of Loading upon the Modulus of Rup- 
ture of Beams,” Proceedings of the American Society of Testing Materials, 41, 1072 (1941). 

4 N. Davidenkow, E. Shevandin and F. Wittman, “The Influence of Size on the Brittle Strength of 
Steel,” Journal of Applied Mechanics, 14, No. 1, A63-67 (1947). 

5 T. A. Kontorova, J. Tech. Phys. U.S.S.R., 10, 886 (1940); J. I. Frenkel and T. A. Kontorova, “A
Statistical Theory of the Brittle Strength of Real Crystals,” J. Phys. U.S.S.R., 7, 108 (1943).
This should be compared with Weibull’s result to the effect that the
most probable value of the strength of specimens of volume V is proportional
to \( S_0 V^{-1/m} \). Frenkel and Kontorova consider the discovery
of (1) to be very important and to justify the assumption that the
local distribution of strengths due to flaws is Gaussian. Statisticians
familiar with the theory of extreme values will recognize (1) as due in
more precise form to R. A. Fisher and Tippett⁶ and Gumbel.⁷ The more
precise formula for (1) is

    \[ \mu - \sigma\sqrt{2 \log nV} + \sigma\,\frac{\log \log nV + \log 4\pi}{2\sqrt{2 \log nV}}. \]





Beginning with World War II several scientists in this country be- 
came interested in statistical theories of fracture. Some work was done 
by Ruark and Rosen⁸ of the University of North Carolina under contract
with the Navy Department. More recently two metallurgists,
J. C. Fisher and Hollomon,⁹ have proposed a statistical theory of
fracture based on the idea that the material under study is an elastic
solid containing many thin disc-like cracks with elliptical cross-sections.
The major axes of the ellipses are assumed to be of size x possessing
distribution \( p(x) = h e^{-hx} \), \( x \ge 0 \). Brooks¹⁰ of the Westinghouse Electric
Corporation has presented experimental evidence indicating the importance
of the theory of extreme values in understanding the breakdown
of paper capacitors. Collaboration between Brooks and the
author¹¹ has resulted in a paper which contains results useful to design
engineers in the field of dielectrics. It is shown, among other things,
that the most probable value of the breakdown strength V(A) of
capacitors of area A is related to A by a relationship of the form


(2)    \[ V(A) = \alpha - \beta \log A, \]


where \( \alpha \) and \( \beta \) are positive constants determined by the number of
flaws (technically known as conducting particles) per unit area and by 
the particle size distribution. 





6 R. A. Fisher and L. H. C. Tippett, “Limiting Form of the Frequency Distribution of the Largest
and Smallest Member of a Sample,” Proceedings of the Cambridge Philosophical Society, 24, 180 (1928).

7 E. J. Gumbel, “Les valeurs extrêmes des distributions statistiques,” Annales de l'Institut Henri
Poincaré, 5, 115 (1936).

8 A. E. Ruark and N. Rosen, “Statistical Effects in the Testing of Materials,” Naval Research
Laboratory Report No. 0-2191.

9 J. C. Fisher and J. H. Hollomon, “A Statistical Theory of Fracture,” American Institute of Mining
and Metallurgical Engineers, Tech. Pub. No. 2218, Metals Technology, August 1947.

10 H. Brooks, “The Probable Breakdown Voltage of Paper Dielectric Capacitors,” American Institute
of Electrical Engineers, Technical Paper, 47-164, 1947.

11 B. Epstein and H. Brooks, “The Theory of Extreme Values and Its Implications in the Study of
the Dielectric Strength of Paper Capacitors,” accepted by the Journal of Applied Physics.
All of the papers mentioned thus far were based essentially on the
concept that fracture is determined by the weakest link. A much more
complicated and very interesting problem has recently been considered
by H. E. Daniels.¹² His paper is devoted to studying the strength of
bundles of threads where the load S is now applied to a set of elements 
in parallel and not in series. Whereas in the series case each element 
carries the full load S, in the parallel case each thread initially carries 
load S/n. It turns out that a bundle will not break under a load S if, 
and only if, there exists an integer k ≤ n such that k among the threads
have strength exceeding S/k. The problem of the distribution of strength 
of bundles of n threads can no longer be treated as a problem in the 
distribution of smallest values. It turns out that whereas the distribu- 
tion of strengths of single long strands is negatively skewed, the distri- 
bution of strengths of bundles of n threads is asymptotically normal 
for large n under fairly general assumptions about the distribution of 
strengths of individual strands. 
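
The contrast between the series and the parallel case can be made concrete with a small Python sketch. It is an illustration of the survival criterion quoted above, not a reproduction of Daniels' analysis: if the threads are ranked from strongest to weakest, the criterion says the bundle carries a load S when, for some k, the k-th strongest thread can bear S/k, so the largest supportable load is the maximum of k times the k-th strongest strength. The function names and sample strengths are hypothetical, and equal load sharing among surviving threads is assumed.

```python
def chain_strength(element_strengths):
    """Series case: every element carries the full load, so the weakest one decides."""
    return min(element_strengths)


def bundle_strength(thread_strengths):
    """Parallel case with equal load sharing among the surviving threads.

    The bundle holds a load S if some k threads can each bear S / k,
    so the greatest supportable load is max over k of k * (k-th strongest).
    """
    ordered = sorted(thread_strengths, reverse=True)
    return max(k * strength for k, strength in enumerate(ordered, start=1))


threads = [1.0, 2.0, 3.0]  # hypothetical individual strengths
print(chain_strength(threads))   # weakest link
print(bundle_strength(threads))  # best of 1*3.0, 2*2.0, 3*1.0
```

For these three threads the chain is only as strong as its weakest member, while the bundle is limited by the two strongest threads acting together.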

Daniels’ result has been mentioned here even though it is not a 
purely “weakest link” problem because of its possible importance in
studying certain aspects of the strength of materials. In any real problem
the situation is probably described precisely neither by elements
in series nor by elements in parallel, but rather by elements distributed in some rather
complicated way. However, it is of value to know what to expect in 
certain limiting idealized models. 


THE IMPLICATIONS OF VARIOUS DISTRIBUTION ASSUMPTIONS 
IN BREAKDOWN PROBLEMS 


There is considerable dispute in the technical literature as to which 
assumptions concerning the distribution of flaw strengths or distribu- 
tions of sizes of cracks or conducting particles are the correct ones. Of 
course, this sort of question cannot be answered purely on statistical 
grounds. Adequate experimental data of the right type must be avail- 
able before one can decide, in a reasonably rigorous way, what the 
underlying distribution laws are. At the moment the situation seems 
to be as follows: (a) there is a certain amount of data drawn from a 
variety of fields which partially substantiates Weibull’s assumptions; 
(b) many experimenters have observed that the most probable value 
of the strength decreases as some function of the logarithm of the size
of specimen; (c) the distribution of strengths of specimens all of the 
same size appears to be negatively skewed; (d) in the breakdown of 





12 H. E. Daniels, “The Statistical Theory of the Strength of Bundles of Threads. I,” Proceedings of 
the Royal Society, London, 183, 405, 1945. 
capacitors there appears to be ample justification for assuming that the
sizes of conducting particles are distributed according to a size distri- 
bution of the form f(z) =e, z>0. It can then easily be shown that 
the most probable value of the breakdown voltage will decrease lin- 
early with the logarithm of the area. Physical data substantiate this 
and other conclusions. 

The physical implications of various distribution assumptions are 
discussed in a recent paper.¹³ In treating such questions one is generally
justified in working with the asymptotic portion of extreme value
theory since the number of defects involved in fracture problems is
large. It is a very important fact that the distribution of extreme values
in samples of size n (n → ∞) possesses a kind of stability with n playing
in a certain sense the role of a parameter. It is shown, for example, in
Cramér¹⁴ that if samples of size n are drawn from a population described
by the continuous p.d.f. \( f(x) \) and associated d.f. \( F(x) = \int_{-\infty}^{x} f(t)\,dt \),
then the asymptotic frequency distribution of y, the smallest value
in samples of size n (n → ∞), is given by

(3)    \[ h(y) = n f(y)\, e^{-nF(y)}. \]

The d.f. is

(4)    \[ H(y) = 1 - e^{-nF(y)}. \]


It is convenient to introduce a random variable \( \eta \) which is related to
the random variable y by the relationship \( \eta = nF(y) \), \( 0 \le \eta \le n \). (3)
then assumes the simple form


(5)    \[ h(\eta) = e^{-\eta}, \quad 0 \le \eta \le n. \]


Similarly, the asymptotic p.d.f. of x, the largest value in samples of size
n, is given by


(6)    \[ h(x) = n f(x)\, e^{-n[1-F(x)]}. \]

The asymptotic d.f. is

(7)    \[ H(x) = e^{-n[1-F(x)]}. \]


If a random variable \( \xi = n[1-F(x)] \) is introduced, then (6) becomes

(8)    \[ h(\xi) = e^{-\xi}, \quad 0 \le \xi \le n. \]


13 B. Epstein, “Statistical Aspects of Fracture Problems,” Journal of Applied Physics, 19, 140-147 (1948).

14 H. Cramér, Mathematical Methods of Statistics, Princeton University Press, pp. 373-378, 1946.


Equations (3) through (8) form a basis from which a number of
physical conclusions of interest can be drawn. Equation (4) has a
direct physical interpretation in the class of problems under consideration.
Imagine that one is breaking one-dimensional threads possessing
\( \lambda \) flaws per unit length and with flaw strength distribution \( f(x) \). Then
\( \lambda L \) is the expected number of flaws in a specimen of length L and
\( \lambda L F(x) \) is the expected number of flaws having strength < x in specimens
of length L. If, as is presumably (but not necessarily) the case,
the flaws are distributed individually and collectively at random,
it can be assumed that the number of flaws having strength
< x in specimens of length L is a random variable distributed according
to the law of Poisson. Therefore, the expected fraction of specimens of
length L having strength < x is given exactly and not merely asymptotically
by


(9)    \[ H(x) = 1 - e^{-\lambda L F(x)}. \]


The identification of (9) with (4) is complete since \( \lambda L \) plays the role
of the sample size n. A similar interpretation can be attached to (7).
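
The claim that (9) holds exactly under the Poisson assumption can be checked numerically. The Python sketch below is an illustration under stated assumptions, not part of the original argument: it takes the number of flaws in a specimen to be Poisson with mean equal to (flaws per unit length) times (length), averages the weakest-link failure probability over that number, and compares the result with the closed form in (9) and with a fixed sample size. All function names and numerical values are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of exactly k flaws, computed in logs to avoid overflow."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def broken_fraction_poisson(flaws_per_unit, length, weak_prob, max_flaws=400):
    """Average over a Poisson number of flaws of P(at least one flaw is weak).

    `weak_prob` stands for F(x), the chance that a single flaw has strength < x.
    """
    mean = flaws_per_unit * length
    survive = sum(poisson_pmf(k, mean) * (1.0 - weak_prob) ** k
                  for k in range(max_flaws + 1))
    return 1.0 - survive


def broken_fraction_eq9(flaws_per_unit, length, weak_prob):
    """Closed form corresponding to equation (9)."""
    return 1.0 - math.exp(-flaws_per_unit * length * weak_prob)


def broken_fraction_fixed_n(n, weak_prob):
    """Exact result when every specimen has exactly n flaws."""
    return 1.0 - (1.0 - weak_prob) ** n


print(broken_fraction_poisson(2.0, 10.0, 0.05))
print(broken_fraction_eq9(2.0, 10.0, 0.05))
print(broken_fraction_fixed_n(20, 0.05))
```

The first two figures agree to within the truncation of the Poisson sum, whereas the fixed-n value agrees with them only approximately, which is the distinction between "exactly" and "asymptotically" drawn in the text.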

If f(x) is assumed to be one of a number of common distribution
functions, the application of (4) and (5) yields the results in Table I.
The results for the first two distributions will be found in Cramér.¹⁴


TABLE I

Probability Density Functions, and Asymptotic Distribution of Smallest Values
in Samples of Size n (n large)

1. Laplace distribution

    Probability density function:  \( f(x) = \frac{1}{2\lambda}\, e^{-|x-\mu|/\lambda} \)

    Smallest value:  \( y = \mu - \lambda \log (n/2) + \lambda \log \eta \)

2. Gauss’ distribution

    Probability density function:  \( f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2} \)

    Smallest value:  \( y = \mu - \sigma\sqrt{2 \log n} + \sigma\,\frac{\log \log n + \log 4\pi}{2\sqrt{2 \log n}} + \frac{\sigma\zeta}{\sqrt{2 \log n}} \)

3. Distribution of type assumed by Weibull

    Probability density function:  \( f(x) = \alpha\beta x^{\beta-1} e^{-\alpha x^{\beta}}, \quad x > 0,\ \alpha > 0,\ \beta > 1 \)

    Smallest value:  \( y = (\eta/n\alpha)^{1/\beta} \)  or  \( \log y = -\frac{1}{\beta} \log n\alpha + \frac{\zeta}{\beta} \)

\( \eta \) is distributed with p.d.f. \( h(\eta) = e^{-\eta} \), \( \eta \ge 0 \), and \( \zeta = \log \eta \) is distributed with p.d.f. \( g(\zeta) = e^{\zeta} e^{-e^{\zeta}} \), \( -\infty < \zeta < \infty \).
Since \( \zeta \), the random portion of y, the smallest value in samples of
size n, is distributed as \( e^{\zeta} e^{-e^{\zeta}} \), a distribution which is strongly skewed
to the left, it can be said that: (a) if the distribution of flaw strengths
follows the Laplace law, Gaussian law, or more generally one of the
type \( A e^{-B|x-\mu|^p} \) for large values of \( |x-\mu| \) (where A, B, \( \mu \), and p are positive
constants), then the distribution of strengths of samples all of the
same size is negatively skewed. More precisely, if P(y) is the fraction of
specimens of strength ≤ y, then \( \log \log [1/(1-P(y))] \) is a linear function of
y; (b) if the distribution of flaw strengths follows a Weibull distribution
then the distribution of the logarithms of the strength of samples all
of the same size is negatively skewed. More precisely, if P(y) is the
fraction of specimens of strength ≤ y, then \( \log \log [1/(1-P(y))] \) is a linear
function of log y.
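
The two linearity statements suggest a simple graphical test, which may be sketched in Python as follows. This is one possible procedure under stated assumptions, not a method taken from the papers cited: the observed strengths are ranked, each is assigned a cumulative fraction P by the midpoint rule (i - 0.5)/N (an assumed plotting convention), and log log [1/(1 - P)] is fitted by least squares against y for type (a) or against log y for type (b). The synthetic data and all names are hypothetical.

```python
import math


def plotting_positions(count):
    """Assumed convention: cumulative fraction (i - 0.5) / count for the i-th smallest value."""
    return [(i - 0.5) / count for i in range(1, count + 1)]


def double_log(p):
    """log log [1 / (1 - p)] for a cumulative fraction 0 < p < 1."""
    return math.log(math.log(1.0 / (1.0 - p)))


def fit_line(xs, zs):
    """Ordinary least-squares slope and intercept of z on x."""
    n = len(xs)
    mean_x, mean_z = sum(xs) / n, sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    slope = sxz / sxx
    return slope, mean_z - slope * mean_x


def extreme_value_fit(strengths, log_scale):
    """Fit the double-log transform against y (type a) or log y (type b)."""
    ordered = sorted(strengths)
    zs = [double_log(p) for p in plotting_positions(len(ordered))]
    xs = [math.log(y) if log_scale else y for y in ordered]
    return fit_line(xs, zs)


# Synthetic strengths placed exactly on a Weibull-type law 1 - exp(-n*alpha*y**beta).
n_alpha, beta = 50.0, 3.0
synthetic = [(-math.log(1.0 - p) / n_alpha) ** (1.0 / beta)
             for p in plotting_positions(25)]
slope, intercept = extreme_value_fit(synthetic, log_scale=True)
print(slope, intercept)  # slope recovers beta, intercept recovers log(n_alpha)
```

With real data the points will scatter about the line, and comparing the straightness of the two plots gives an informal indication of whether type (a) or type (b) is the better description.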

There are physical data at present available which follow each of 
these patterns. For example, the distribution of breakdown voltages 
of capacitors¹⁰,¹¹ all of the same size is of type (a). Weibull³ gives several
examples, drawn from a number of fields, which are of type (b).

In the study of size effect, i.e., how specimen size affects the distribu- 
tion of strengths, it is important to see what happens in Table I when 
the size of sample changes. In particular, it is of interest to see how the 
most probable value (i.e., the mode), the mean, and variance of the 
smallest value in samples of size n change with n. To do this note that 


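(10)    The most probable value of \( \zeta \) is \( \tilde{\zeta} = 0 \),

        the mean value of \( \zeta \) is \( E(\zeta) = \int_0^{\infty} (\log z)\, e^{-z}\, dz = -0.577 \), and

        the variance of \( \zeta \) is \( E(\zeta^2) - [E(\zeta)]^2 = \int_0^{\infty} (\log z)^2 e^{-z}\, dz - \left[ \int_0^{\infty} (\log z)\, e^{-z}\, dz \right]^2 = \pi^2/6 \),

and

(11)    The most probable value of \( \eta^{1/\beta} \) is \( [(\beta-1)/\beta]^{1/\beta} \),

        \( E(\eta^{1/\beta}) = \Gamma\left(\frac{\beta+1}{\beta}\right) \), and

        \( \mathrm{var}\,(\eta^{1/\beta}) = \Gamma\left(\frac{\beta+2}{\beta}\right) - \left[\Gamma\left(\frac{\beta+1}{\beta}\right)\right]^2 \).


It then follows that

(12)    \( \tilde{y} = \mu - \lambda \log (n/2), \quad \bar{y} = \tilde{y} - 0.577\lambda, \quad \sigma_y^2 = \lambda^2\pi^2/6, \)

if the original population is a Laplace distribution with parameters
\( (\mu, \lambda) \),

(13)    \( \tilde{y} = \mu - \sigma\sqrt{2 \log n} + \sigma\,\frac{\log \log n + \log 4\pi}{2\sqrt{2 \log n}}, \)

        \( \bar{y} = \tilde{y} - 0.577\sigma/\sqrt{2 \log n}, \)

        \( \sigma_y^2 = \sigma^2\pi^2/(12 \log n), \)

if the original population is Gaussian with parameters \( (\mu, \sigma) \), and

(14)    \( \tilde{y} = \left(\frac{1}{n\alpha}\right)^{1/\beta} \left(\frac{\beta-1}{\beta}\right)^{1/\beta} = \tilde{x}\, n^{-1/\beta}, \)

        \( \bar{y} = \Gamma\left(\frac{\beta+1}{\beta}\right) \Big/ (n\alpha)^{1/\beta}, \)

        \( \sigma_y^2 = \left\{ \Gamma\left(\frac{\beta+2}{\beta}\right) - \left[\Gamma\left(\frac{\beta+1}{\beta}\right)\right]^2 \right\} \Big/ (n\alpha)^{2/\beta}, \)

if the original population is of the Weibull type with parameters
\( (\alpha, \beta) \). Here \( \tilde{x} \) denotes the mode of the original Weibull distribution, and
\( \tilde{y} \), \( \bar{y} \), and \( \sigma_y^2 \) denote the most probable value, the mean, and the variance of y.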
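
The two constants that enter these results, the mean of about -0.577 and the variance of π²/6 for the logarithm of a variable with p.d.f. e^{-η}, can be verified by a crude numerical integration. The Python sketch below is only a check under stated assumptions: it substitutes η = -log u with u uniform on (0, 1) and applies a midpoint rule, so the results are approximate. The function name and the number of cells are arbitrary choices.

```python
import math


def zeta_moments(cells=200_000):
    """Midpoint-rule estimates of the mean and variance of zeta = log(eta).

    eta = -log(u) has p.d.f. e^{-eta} when u is uniform on (0, 1),
    so averaging over u replaces the integrals over eta.
    """
    total = 0.0
    total_sq = 0.0
    for i in range(cells):
        u = (i + 0.5) / cells            # midpoint of the i-th cell of (0, 1)
        zeta = math.log(-math.log(u))
        total += zeta
        total_sq += zeta * zeta
    mean = total / cells
    return mean, total_sq / cells - mean * mean


mean_zeta, var_zeta = zeta_moments()
print(mean_zeta, var_zeta, math.pi ** 2 / 6)
```

The estimated mean is the negative of Euler's constant to about three decimals, and the estimated variance is close to π²/6.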

Equations (12), (13) and (14), when interpreted physically, give 
definite information on how the size effect depends on the distribution 
of strengths in the vicinity of flaws. In each instance the specimens be- 
come weaker as V, the specimen size, increases. In the first two cases 
the relationships are both semi-logarithmic in character (\( \tilde{y} \) decreases
linearly with log V in the Laplace case and with \( \sqrt{\log V} \) in the Gauss
case). However, the spread of the distribution remains unchanged in
the Laplace case, but becomes narrower with increasing V in the Gauss
case. Under the Weibull assumption \( \log \tilde{y} \) (as well as \( \log \bar{y} \)) decreases
linearly with log V and the spread also decreases.
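
The three size effects can be tabulated side by side with a short Python sketch. It rests on one reading of formulas (12), (13) and (14), namely the standard asymptotic results for the smallest value from Laplace, Gaussian and Weibull-type populations, and the sample size n stands in for specimen size. The function names and the parameter values are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # the constant written as 0.577 in the text


def laplace_smallest(mu, lam, n):
    """Mode, mean and variance of the smallest value, Laplace population."""
    mode = mu - lam * math.log(n / 2.0)
    return mode, mode - EULER_GAMMA * lam, lam ** 2 * math.pi ** 2 / 6.0


def gauss_smallest(mu, sigma, n):
    """Mode, mean and variance of the smallest value, Gaussian population (n > 1)."""
    root = math.sqrt(2.0 * math.log(n))
    mode = (mu - sigma * root
            + sigma * (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * root))
    mean = mode - EULER_GAMMA * sigma / root
    return mode, mean, sigma ** 2 * math.pi ** 2 / (12.0 * math.log(n))


def weibull_smallest(alpha, beta, n):
    """Mode, mean and variance of the smallest value, Weibull-type population."""
    scale = (n * alpha) ** (-1.0 / beta)
    g1 = math.gamma((beta + 1.0) / beta)
    g2 = math.gamma((beta + 2.0) / beta)
    mode = scale * ((beta - 1.0) / beta) ** (1.0 / beta)
    return mode, scale * g1, scale ** 2 * (g2 - g1 ** 2)


for n in (100, 10_000, 1_000_000):
    print(n,
          [round(v, 3) for v in laplace_smallest(10.0, 1.0, n)],
          [round(v, 3) for v in gauss_smallest(10.0, 1.0, n)],
          [round(v, 5) for v in weibull_smallest(2.0, 3.0, n)])
```

In the printed rows the Laplace variance stays fixed while the Gaussian and Weibull variances shrink, and every mode decreases as n grows, in line with the verbal summary above.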

It can be shown similarly that if the flaw strength distribution is of
the form \( A e^{-B|x-\mu|^p} \) for large values of \( |x-\mu| \) (where A, B, \( \mu \), and p are
positive constants), then the most probable value of the strength will
decrease proportionately to \( (\log n)^{1/p} \). If the distribution is rectangular
then there would be no size effect. The results in this paragraph, in
Table I, and in (12), (13), and (14) furnish a statistical basis for testing
which assumptions on the underlying flaw strength distributions are
reasonable. In the last analysis definitive answers of physical interest
must await the accumulation of more pertinent physical data. Some of
the techniques used in analyzing such data are closely related to those
used by Gumbel on the theory of floods and distribution of oldest ages
at death.¹⁵,¹⁶
It should be mentioned that in some strength problems it is convenient
to work with the distribution of largest values in preference to the
distribution of smallest values. This is particularly true in studying the 
breakdown strength of capacitors. It is fairly well established that one 
of the major causes of breakdown of capacitors is the presence of flaws 
known technically as conducting particles. These particles are present 
no matter how carefully capacitor paper is made or how carefully the 
capacitor itself is manufactured. The conducting particles are spread 
at random throughout the area of the capacitor and, depending on their 
size and orientation, create a local weakening of the capacitor by re- 
ducing the nominal insulation thickness in the neighborhood of the 
flaws. But the voltage required to break down a capacitor is equal to 
the voltage required to break it down at its weakest point, or equivalently
at the place where the greatest effective penetration of the insulation
has occurred. Therefore, the distribution of largest conducting
particle sizes is pertinent in this problem. Working with largest particle 
sizes in a sample also appears to be a good way to treat the metallurgical 
problem of Fisher and Hollomon.⁹ If this is done it turns out that their
statistical and physical assumptions taken together imply that the
strength of specimens of size V should fall off as the reciprocal of
\( \sqrt{\log V} \).





CONCLUSIONS AND GENERAL REMARKS 


When the problem of fracture is viewed in its most general setting it 
is clear that the “weakest link” concept, when applicable, may be the
key to the occurrence of certain observed relationships between the 
strength of specimens and their size. The term strength is meant to be 
quite general in scope: we may have in mind mechanical strength, elec- 
trical strength, resistance of painted specimens to the corrosive effects 
of the atmosphere, or perhaps the life span of a device which ceases to 
function when any one of a number of vital parts breaks down. From 
a statistical point of view phenomena of this type, though different at 
first glance, are really equivalent since they lead in each instance to 
the same problem, namely, the distribution of smallest values in large 
samples. This may account for the fact that workers in many different 
fields have observed skewed strength distributions and approximately 
semi-logarithmic dependence on size. As a matter of fact, the considerations
in this paper indicate that under certain hypotheses there is a
statistical compulsion for such results.

15 E. J. Gumbel, “The Return Period of Flood Flows,” Annals of Math. Stat., 12, 163, 1941.
16 E. J. Gumbel, “La durée extrême de la vie humaine,” Actualités Scientifiques et Industrielles,
Hermann et Cie, Paris, 1937.

Many scientists accept the fact that the Gaussian distribution plays 
a fundamental role in science, and, in fact, there are many who feel 
that this is the only distribution which nature calls truly her own. It has 
been shown in this paper, however, that in a certain class of phenomena 
the typical distributions are far from normal, and are, in fact, often 
strongly skewed to the left. There are other instances lying outside the 
scope of the present considerations where still other “characteristic” 
distributions are found. For example, the particle sizes arising from 
crushing and grinding have a marked tendency to follow a logarith- 
mico-normal form. 

In conclusion we wish to emphasize that the “weakest link” hypoth- 
esis cannot be expected to explain all phenomena related to strength 
of materials. In this paper it has been assumed that the flaws are dis- 
tributed at random and do not influence one another in any way. 
Either of these assumptions may be totally or in part untrue in certain 
physical situations. For example, putting a notch on the surface of a 
beam subjected to bending may cause the strength to drop a great deal. 
However, if many notches are put on the surface, there is scarcely any 
effect on strength. This fact is of importance in the fatigue strength 
of a body having a highly polished surface. A single scratch may be 
ruinous whereas roughing up the whole surface with many scratches 
may bring about little change in the strength. 

Weibull³ and Daniels¹² have obtained some interesting results relating
to more general situations but much remains to be done.

































PROBLEMS WITH SAMPLING PROCEDURES FOR 
RESERVE VALUATIONS* 


GEORGE C. CAMPBELL
Metropolitan Life Insurance Company 


The computation of the reserve liability, running up to 
about $49 billion in North American life insurance companies, 
involves a large amount of work, some of which might be saved 
if sampling methods were satisfactory. Certain possible ap- 
proaches to the sampling problem are mentioned, and the 
mathematical problem is developed for one such approach. 
Experimental results are presented from one small exploratory 
sample intended to establish orders of magnitude. Certain ten- 
tative conclusions are presented. 


IT IS well known that the death rate is very low in the teens; that it is
fairly low but increasing slowly up to the middle ages; and that it
then increases rapidly through the very old ages. Theoretically, it is 
possible to increase the premium for life insurance each year of age in 
accordance with the risk of death in that year, starting with a very 
low premium in the beginning and reaching very high annual premiums 
late in life. Such plans have been tried and found impractical. Large and 
rapidly increasing premiums late in life are unrealistic in relation to the 
policyholder’s income. 

Very early in the insurance business it was found more practical to 
sell insurance on a level annual premium basis during the premium- 
paying period. This means that the level annual premiums are more 
than necessary to pay expected claims in the early years of the insur- 
ance. These amounts in excess of claims must be saved and invested 
to provide a fund for the later years of insurance when the expected 
claims will exceed the premium collections. The theory is relatively 
simple for the computation of this fund which is called the reserve. 
When a policy is issued, the present value of future net premiums, after 
deducting the loading for expenses in the gross premium, is exactly 
equal to the present value of the benefit to be paid. In computing pres- 
ent value, future payments are discounted to the present time on the 
basis of interest and the probability of survival according to a mortality 
table. Considering the case where the risk of death is increasing with 
age, if the present value of the level benefit is exactly equal to the pres- 
ent value of the net level premiums when a policy is issued, then at 
any time subsequent to issue, the present value of the benefit will 
exceed the present value of the future net premiums. The excess of 





* A paper presented at the 107th Annual Meeting of the American Statistical Association, New 


York City, December 30, 1947. 
the present value of the benefit over the present value of future net
premiums at any point in time is the reserve on the policy which the
Company must set up in order to meet its commitments.¹ The reserve
is a definite liability of the Company which must be shown on the 
balance sheet as required by law. 

After the mortality table and the rate of interest have been selected 
to satisfy legal and managerial requirements, the average net level 
premium reserve per $1 of insurance is uniquely determined for each 
type of policy by the contractual provisions of the policy (including 
statutory references), by the duration from issue, and by the age at 
issue. In the Metropolitan, such a classification leads to about 200,000 
individual cells for Ordinary and Industrial Life Insurance. It is necessary
to ascertain the amount of insurance in force (i.e., in being) falling
into each of these cells. A large amount of work is expended in the
classification of all current transactions, such as new policies issued, 
lapses, revivals, death claims, maturities, etc., into these 200,000 cells 
in order to carry forward the classification from one year to the 
next. Even after the insurance in force is obtained in this detail, it is 
necessary then to multiply the amount of insurance in each one of 
these cells by the appropriate reserve factor per dollar of insurance 
to obtain the corresponding reserve liabilities which in turn must be 
summarized into final figures. All of this work must be thoroughly 
checked in detail and by controls to insure accuracy. 

The determination of the reserve liability for all of the policies the 
Company has outstanding is generally referred to as reserve valuation. 
Separate calculations are made for the several branches of business, 
and for related benefits. The amounts involved are very large and con- 
stitute the principal liability of any life insurance company. For the 
business at large, in North America at the end of 1947, the total assets 
amount to about $55 billion of which about $49 billion is required to 
cover reserve liabilities. In any one of the larger companies, the reserve 
may well run to a few billion dollars. It is clear that the reserve valua- 
tion problem is of great practical importance and that it involves con- 
siderable labor and expense. It should not be assumed, however, that 
most of this work could be eliminated simply by eliminating or modify- 
ing the requirements for reserve valuation. Much of the same informa- 
tion would still be needed for mortality studies and for other purposes. 
If it were possible to obtain the reserve liability from a reasonably small 
sample with sufficient accuracy to protect adequately the policyholders 
and to be accepted by Company management and by State supervisory 





1 For details regarding net premium and reserve computations, see E. F. Spurgeon, Life Contingen- 
cies, Cambridge University Press, 1932. 
authorities, a considerable saving in Home Office expenses could be
made.


THE MATHEMATICAL PROBLEM OF SAMPLING

When we approach the valuation problem from the sampling point 
of view, a number of possibilities present themselves. The maximum 
saving in clerical labor could be made if a satisfactory approximation 
could be obtained by sampling from a file of cards representing policies 
in force without the need for recourse to any running transaction 
accounts whatsoever. This would mean, for a particular block of busi- 
ness, that we would require reliable estimates from a sample alone 
for the number of policies, the amount of insurance, and the reserve,
together with their sampling variances. 

It seems unlikely, however, that Company management would be 
willing to discard all control accounts on the number of policies
and on the amount of insurance. Consequently, it seems more realistic 
to approach the sampling problem on the assumption that certain con- 
trol accounts would be kept, giving the number of policies and the 
amount of insurance in force accurately for certain blocks of business 
in convenient sub-groups. The problem reduces to finding the average 
reserve per $1 insurance, and its variance, from a sample separated into 
corresponding sub-groups. The reserve would then be estimated for 
each sub-group from the actual total amount of insurance and the 
average reserve per $1 insurance as estimated from the sample. 

It appears desirable to take advantage of the high correlation exist- 
ing between the amount of reserve and the amount of insurance 
on groups of individual policies. There is an exact mathematical relationship
for individual policies, and a correlation coefficient of the order
of 0.7 for the aggregate group of Regular Ordinary policies in the 
Metropolitan. In most of the sampling problems related to valuation 
studied by the writer, the solution depends on the relation of the quan- 
tity being estimated (in this case, the reserve) either to the amount 
of insurance or to the amount of gross annual premium. This rela- 
tionship leads to a two-way scatter diagram. 

Consider the two-way scatter diagram presented as Table I with the 
reserve factors per $1 of insurance along the X-axis and the amount of 
insurance along the Y-axis. In the mathematical derivation we must 
consider the related scatter diagram formed by the amount of reserve 
along the X-axis and the amount of insurance along the Y-axis. For- 
tunately, numerical results for the second scatter diagram can be ob- 
tained from the first scatter diagram based on reserve factors by 
taking the appropriate cross product between the reserve factor and 
the amount of insurance whenever the amount of reserve is needed. 
The average reserve per $1 insurance in a sub-group of any particular
block of business is obtained readily by summing the reserve on all
of the policies in the sub-group and dividing by the sum of the insur- 
ance on all of its policies. An estimate is required for the variance of 
the average reserve per $1 of insurance. 

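Let V = the average reserve per $1 insurance
    R = the average reserve per policy
    S = the average insurance per policy

Then

    \[ V = R/S \]  for the universe    (1)

and

    \[ V + \Delta V = \frac{R + \Delta R}{S + \Delta S} \]  for a sample    (2)

Hence

    \[ \Delta V = \frac{R}{S} \left[ \frac{\Delta R}{R} - \frac{\Delta S}{S} \right] \left[ 1 + \frac{\Delta S}{S} \right]^{-1} \]    (3)

and

    \[ (\Delta V)^2 = \frac{R^2}{S^2} \left[ \frac{(\Delta R)^2}{R^2} - 2\,\frac{\Delta R\,\Delta S}{RS} + \frac{(\Delta S)^2}{S^2} \right] \left[ 1 + \frac{\Delta S}{S} \right]^{-2}. \]    (4)

Summing for all samples, dividing by the number of samples, and
taking unity as a reasonable approximation to the last factor

    \[ \sigma_V^2 = \frac{R^2}{S^2} \left[ \frac{\sigma_{\bar{R}}^2}{R^2} - 2r\,\frac{\sigma_{\bar{R}}\,\sigma_{\bar{S}}}{RS} + \frac{\sigma_{\bar{S}}^2}{S^2} \right] \]    (5)

Replacing the variances of averages by the variances of individual policies,
we have the approximate formula

    \[ \sigma_V^2 = \frac{R^2}{nS^2} \left[ \frac{\sigma_R^2}{R^2} - 2r\,\frac{\sigma_R\,\sigma_S}{RS} + \frac{\sigma_S^2}{S^2} \right] \]    (6)

where n is the number of policies in the sample and r is the correlation
coefficient between R and S.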

The above expression enables us to compute an approximate value 
of the required variance of the average reserve per $1 of insurance from 
the scatter diagram. 
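
For individual policies rather than grouped frequencies, the approximate variance can be computed directly, as in the following Python sketch. It is an illustration under stated assumptions, not the computing procedure actually used: it takes formula (6) to be the usual approximate variance of a ratio of two sample means, replaces the universe averages and variances by their sample estimates, and uses the divisor n - 1 for the sample variances and covariance. The function name and the small data sets are hypothetical.

```python
def ratio_estimate(reserves, insurances):
    """Average reserve per $1 of insurance and its approximate variance.

    The product r * sigma_R * sigma_S in the formula is the covariance of
    reserve and insurance, so the covariance is computed directly.
    """
    n = len(reserves)
    mean_r = sum(reserves) / n
    mean_s = sum(insurances) / n
    var_r = sum((r - mean_r) ** 2 for r in reserves) / (n - 1)
    var_s = sum((s - mean_s) ** 2 for s in insurances) / (n - 1)
    cov_rs = sum((r - mean_r) * (s - mean_s)
                 for r, s in zip(reserves, insurances)) / (n - 1)
    ratio = mean_r / mean_s
    variance = (mean_r ** 2 / (n * mean_s ** 2)) * (
        var_r / mean_r ** 2
        - 2.0 * cov_rs / (mean_r * mean_s)
        + var_s / mean_s ** 2
    )
    return ratio, variance


# Hypothetical policies: amount of insurance and reserve, in dollars.
insurances = [1000.0, 2000.0, 3000.0, 4000.0]
reserves = [100.0, 500.0, 600.0, 1200.0]
ratio, variance = ratio_estimate(reserves, insurances)
print(ratio, variance ** 0.5)
```

If every policy carried the same reserve per dollar of insurance, the three terms in the bracket would cancel and the variance would vanish, which is why a high correlation between reserve and insurance makes the ratio estimate attractive.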


EXPERIMENTAL WORK 


Some exploratory work was done with a sample of 1,015 Regular 
Ordinary policies drawn from a total in force of about 4.3 million policies
for $9.6 billion of insurance with $2.0 billion of reserve liability.
It was thought that a small exploratory sample would serve as a pilot 
study to establish certain orders of magnitude which might be helpful 
in planning any subsequent investigation of the sampling problem. 
There is no intention to use such a small sample for actual valuation 
purposes. 

A sample from this block of business was taken out near the end of 
1946 for another purpose by counting through the renewal cards and 
picking every 50th card. The sample renewal cards were microfilmed 
in order to return them to the active file immediately. The exploratory 
valuation sample of 1,015 cards was obtained from the microfilm file 
by running it through the viewing machine and stopping the film at 
every second turn of the crank. The facts from the projection in view 
were copied by hand on a separate card for this study. 

The total sample of 1,015 cards was used to determine an average 
reserve per $1 of insurance by valuing the individual cards seriatim and 
adding both the insurance and the reserve on the cards to form two 
totals. The quotient gave 0.21515 for the average reserve per $1 of 
insurance. Applying this average to the actual total insurance amount- 
ing to $9,587 million, an estimated reserve of $2,063 million was ob- 
tained which is $53 million in excess of the true reserve. This gives 0.69 
as the ratio of the error to the standard deviation of $77 million esti- 
mated from the sample. Hence by this method of sampling, results to 
this degree of accuracy or better would be obtained in about half of 
repeated applications. 
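
The arithmetic of this paragraph can be retraced in a few lines. The sketch below uses only figures quoted in the text (the true reserve of $2,010 million is implied by the $53 million excess); the variable names are ours, and the normal distribution of the error is an assumption of the sketch.

```python
from statistics import NormalDist

avg_reserve_per_dollar = 0.21515   # quotient from the 1,015 sample cards
total_insurance = 9587.0           # $ millions, actual insurance in force
true_reserve = 2010.0              # $ millions, full arithmetical valuation
std_dev = 77.0                     # $ millions, estimated from the sample

estimate = avg_reserve_per_dollar * total_insurance
error = round(estimate) - true_reserve      # rounded to whole millions, as in the text
ratio = error / std_dev

# Assuming a normal distribution of the error, the chance that a repeated
# sample would do at least this well is P(|Z| <= ratio).
prob_within = 2.0 * NormalDist().cdf(abs(ratio)) - 1.0

print(round(estimate), error, round(ratio, 2), round(prob_within, 2))
```

The ratio lies just above 0.6745, the point that divides the normal distribution into equally likely inner and outer halves, which is why the text speaks of "about half" of repeated applications.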

Individual policies, rather than grouped frequencies from the scatter 
diagram, were used for the numerical work because of the desire to 
experiment with various subdivisions. In order to see how the averages 
would behave in repeated samples, the sample of 1,015 cards was fur- 
ther subdivided into ten groups, each group consisting of all the cards 
with policy numbers ending in the same digit. Table II, giving the 
results from this experiment, shows the average reserves per dollar of 
insurance, the total reserve for this block of business estimated from 
each of the ten sample averages, and the resulting errors obtained by 
comparison with the actual reserve. 

                                  TABLE II
                 BEHAVIOR OF AVERAGES FROM REPEATED SAMPLES

          Number of       Average   Reserve Estimated          Error     Ratio of
Final      Policies   Reserve per from Sample Average      (Estimate     Error to
Digit     in Sample  $1 Insurance          and Actual   less Actual)     Standard
                                           Insurance*     (Millions)   Deviation†
                                           (Millions)

  0             105         .2279              $2,185            175         0.73
  1              93         .2532               2,428            418         1.65
  2             109         .1952               1,871           -139        -0.59
  3             111         .2062               1,976            -34        -0.15
  4             108         .2517               2,413            403  [illegible]
  5              81         .1959               1,878           -132        -0.49
  6             118         .2158               2,068             58         0.26
  7             110         .1898               1,820           -190        -0.82
  8              87         .2372               2,274            264         1.01
  9              93         .1712               1,642           -368        -1.45

True Values                .20966              $2,010

* Actual Insurance $9,586.83 million.

† Standard Deviation estimated from total sample of 1,015 policies adjusted for size of individual
samples.

The actual reserve from the regular arithmetical calculation amounted
to $2,010 million. The best estimate from the small sample of 111
cards was $34 million under the true value, while the worst estimate
obtained from a sample of 93 cards was $418 million over the true
value. It would be expected that 5 of the ratios of error to the standard
deviation would be under 0.6745, disregarding signs, and that 5 would
exceed this value; actually there were 4 under and 6 over. Considering
20 per cent frequency intervals from the lower limit to the upper limit
of the distribution, there were 1, 3, 1, 2, 3 ratios in each interval
compared with the expected 2.

The subdivision of the sample of 1,015 cards into ten random sub- 
groups illustrates that even with a sample of the order of 100 cards, 
it is possible to estimate the reserve, and to predict the range of errors 
with reasonable certainty because the errors, measured in terms of
their expected standard deviation, appear to be distributed in reason- 
ably close agreement with the normal frequency curve. 


EXPERIMENTAL STRATIFICATION 


In the above computation, the policies in the sample were grouped 
together. It would be natural to inquire if the result might be improved 
by stratifying the sample. 

It is easy to select a valuation sample stratified by years of issue 
since policy cards usually are filed in numerical order. This breaks 
them approximately by years of issue. Any other stratification, such 
as by plans or groups of plans, would be difficult to handle if the cards 
had to be selected initially on that basis. With some sacrifice of poten- 
tial efficiency, however, the sample might be stratified after selection 
for application against such control totals. 

An experiment was tried with stratification by year of issue with all
plans combined in each year of issue. Average reserve factors per $1
of insurance from the sample were applied to the total business by
years of issue. This method of estimating the reserve by a stratified 
sample reduced the standard deviation from $77 million to $47 million, 
and reduced the error from $53 million to $19 million, giving a ratio of 
error to standard deviation of 0.41. This was the best result produced 
from this sample. The standard deviation of this estimate indicates 
that the error would be within about $32 million half of the time from 
repeated samples of 1,015 policies selected in this way. There would be 
a 5 per cent risk of exceeding an error of $92 million. The results from 
this experiment are shown in some detail in Table III. 

                                     TABLE III
                 EXPERIMENTAL SAMPLE STRATIFIED BY YEAR OF ISSUE

             Number of      Average                    Reserve (Millions)      Estimated*   Ratio of
Year of       Policies  Reserve per      Actual  Estimated                       Standard   Error to
Issue        in Sample $1 Insurance   Insurance       From                      Deviation   Standard
                        From Sample  (Millions)     Sample    Actual    Error  (Millions)  Deviation

1945                71       .04146     $775.44      32.15     29.30     2.85        4.07        .70
1944                60       .05718      663.29      37.93     37.11      .82        4.08        .20
1943                37       .06319      540.93      34.18     37.21    -3.03        6.66       -.46
1942                55       .08048      438.58      35.30     36.56    -1.26        4.33       -.29
1941                82       .11578      595.02      68.89     61.08     7.81       10.37        .75
1940                59       .12925      440.67      56.96     54.19     2.77        4.88        .57
1939                59       .13596      425.47      57.85     59.33    -1.48        7.43       -.20
1938                48       .13875      408.94      56.74     67.40   -10.66       11.23       -.95
1937                59       .17848      476.37      85.02     87.15    -2.13        8.97       -.24
1936                42       .16573      439.46      72.83     90.42   -17.59        5.98      -2.94
1935                45       .22862      418.48      95.67     95.60      .07        7.23        .01
1934                50       .20800      463.69      96.45    112.15   -15.70       13.53      -1.16
1933                58       .26810      374.42     100.38     96.10     4.28        7.70        .56
1932                44       .29552      343.81     101.60     98.40     3.20       11.21        .29
1931                34       .27207      391.41     106.49    118.18   -11.69        9.14      -1.28
1930                25       .38306      353.53     135.42    116.37    19.05       19.55        .97
1929                32       .37959      363.52     137.99    129.65     8.34       11.57        .72
1928                29       .42777      315.03     134.76    121.69    13.07       11.63       1.12
1927                33       .44542      269.53     120.05    112.53     7.52       10.67        .70
1926                16       .33777      185.96      62.81     64.62    -1.81        3.82       -.47
1925                 9       .37549      177.88      66.79     65.47     1.32        4.10        .32
1924                17       .42269      151.01      63.83     57.86     5.97        4.49       1.33
1923                 7       .33840      140.76      47.63     56.66    -9.03        3.89      -2.32
1922                 6       .56092       87.57      49.12     36.38    12.74        3.25       3.92*
1921 & prior        38       .49779      346.04     172.25    168.43     3.82       18.80        .20

Total Sample     1,015                $9,586.81  $2,029.09 $2,009.84    19.25       47.17        .41

* Note that the estimate of the standard deviation becomes very unreliable when the number of
policies is small; for example there are only 6 policies in the sample for the issue of 1922.

It is interesting to note that, in the last column showing the ratios
of the errors to the corresponding standard deviations, there are 12
below and 13 above the absolute value 0.6745, compared with the
expected 12.5 in each group. If the 25 ratios are classified into 20 per
cent frequency intervals from the lower to the upper limit of the
distribution, the successive intervals contain 5, 3, 5, 8, 4 ratios as compared
with the 5 expected in each interval. This agreement seems satisfactory.
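
The tallies quoted in this paragraph can be checked directly from the error and standard deviation columns of Table III. The sketch below recomputes each ratio from those two columns (rather than copying the printed ratios) and classifies it against the 20, 40, 60, and 80 per cent points of the standard normal curve; the use of normal quantiles as the interval limits is our reading of "20 per cent frequency intervals".

```python
from bisect import bisect_right
from statistics import NormalDist

# (error, estimated standard deviation) in $ millions for the 25
# year-of-issue rows of Table III, from 1945 back to "1921 & prior".
rows = [
    (2.85, 4.07), (0.82, 4.08), (-3.03, 6.66), (-1.26, 4.33), (7.81, 10.37),
    (2.77, 4.88), (-1.48, 7.43), (-10.66, 11.23), (-2.13, 8.97), (-17.59, 5.98),
    (0.07, 7.23), (-15.70, 13.53), (4.28, 7.70), (3.20, 11.21), (-11.69, 9.14),
    (19.05, 19.55), (8.34, 11.57), (13.07, 11.63), (7.52, 10.67), (-1.81, 3.82),
    (1.32, 4.10), (5.97, 4.49), (-9.03, 3.89), (12.74, 3.25), (3.82, 18.80),
]
ratios = [err / sd for err, sd in rows]

# Interval limits: the 20%, 40%, 60%, 80% points of the standard normal curve.
cuts = [NormalDist().inv_cdf(p) for p in (0.2, 0.4, 0.6, 0.8)]

counts = [0] * 5
for r in ratios:
    counts[bisect_right(cuts, r)] += 1

below = sum(abs(r) < 0.6745 for r in ratios)
above = len(ratios) - below

print(counts, below, above)
```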

It will be noted that the stratification experiment was based entirely 
on the sample drawn before considering stratification. Consequently 
this is not an indication of the best result that might be attained from 
efficient stratification. If the sample had been drawn originally with a 
view to stratification, it would have been selected in such a way as to 
minimize the standard deviation of the final estimate. This would have 
required drawing more cards in some of the years of issue where the 
reserve is heavy and a smaller number of cards in other years of issue 
where the reserve is very light. For example, there were 71 cards in the 
issue of 1945 where the total reserve is only $29 million but only 32 
cards in the issue of 1929, which has a reserve of $130 million.
Consequently the issue of 1945 contributed only 17 million million to the
total variance, while the issue of 1929 contributed 134 million million to 
the total variance. 

For a given total number of cards in the over-all sample, the selection 
by years of issue should be made on a basis to minimize the variance 
of the resulting estimate. The variance was recomputed approximately 
on the basis of the most efficient distribution of the sample by years of 
issue, assuming that the estimated variances and correlation coeffi- 
cients among individual policies would not change within the years of 
issue. A new value of 1,692 million million was obtained for comparison 
with the 2,225 million million in the actual sample. These figures indi- 
cate that for this distribution a given degree of accuracy could be 
attained with about one-quarter fewer cards in the sample if it were 
stratified properly by years of issue to minimize the variance. The cor- 
responding standard deviations would be $41.1 million compared with 
$47.2 million. 
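
The paper does not spell out how the "most efficient distribution" was computed. One plausible reconstruction, offered here only as a sketch, treats each year of issue as contributing a variance A_h / n_h with A_h held fixed, so that for a fixed total the variance is least when n_h is proportional to the square root of A_h. Applied to the counts and standard deviations of Table III, this comes close to the two variances quoted above. The 1923 count is taken as 7 (an assumption on our part, which makes the counts add to the stated 1,015).

```python
import math

# (policies in sample, estimated standard deviation in $ millions) for each
# year-of-issue stratum of Table III, 1945 back to "1921 & prior".
# The 1923 count is taken as 7, which makes the counts add to the stated 1,015.
strata = [
    (71, 4.07), (60, 4.08), (37, 6.66), (55, 4.33), (82, 10.37),
    (59, 4.88), (59, 7.43), (48, 11.23), (59, 8.97), (42, 5.98),
    (45, 7.23), (50, 13.53), (58, 7.70), (44, 11.21), (34, 9.14),
    (25, 19.55), (32, 11.57), (29, 11.63), (33, 10.67), (16, 3.82),
    (9, 4.10), (17, 4.49), (7, 3.89), (6, 3.25), (38, 18.80),
]
n_total = sum(n for n, _ in strata)

# Variance of the stratified estimate with the sample as actually drawn.
var_actual = sum(sd ** 2 for _, sd in strata)

# Assumption: stratum h contributes A_h / n_h with A_h = n_h * sd_h**2 fixed.
# Minimizing sum(A_h / n_h) for fixed total n gives n_h proportional to sqrt(A_h).
roots = [sd * math.sqrt(n) for n, sd in strata]
optimal_sizes = [n_total * r / sum(roots) for r in roots]
var_optimal = sum(roots) ** 2 / n_total

# Variance varies inversely with sample size, so the variance ratio is also
# the fraction of cards needed for the same accuracy.
saving = 1.0 - var_optimal / var_actual

print(round(var_actual), round(var_optimal), round(saving, 2))
print(round(math.sqrt(var_actual), 1), round(math.sqrt(var_optimal), 1))
```

Under this allocation the 1945 issue, for instance, would receive fewer cards than the 71 actually drawn and the 1929 issue more than 32, in line with the remarks in the text.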

On the basis of stratification by year of issue, but without reflecting 
the most efficient distribution of the sample, Table IV has been pre- 
pared for Metropolitan Regular Ordinary business, giving the approxi- 
mate number of policies that would be required in a sample in order to 
provide various degrees of accuracy and risk. The amount of error is 
shown down the left margin both as a percentage and as an approxi- 
mate amount of money based on the $2.0 billion reserve in this block 
of business. The amount of risk is shown across the top of the table. 
                                     TABLE IV
                APPROXIMATE SIZE OF SAMPLE FOR VARIOUS DEGREES
                            OF ACCURACY AND RISK*

 Error in Reserve                 Risk of Exceeding Designated Error
  Amount         %          50%          10%           5%           1%         0.1%
(Millions)
                  Approximate Number of Policies Required in Sample*

      $2       0.1      230,000    1,100,000    1,400,000    1,950,000    2,500,000
       5      0.25       39,000      220,000      310,000      500,000      760,000
      10       0.5        9,700       57,000       80,000      140,000      220,000
      20         1        2,500       14,500       21,000       35,000       57,000
      40         2          610        3,600        5,100        8,800       14,500
      80         4          150          900        1,300        2,200        3,600

* Adjusted for sampling from finite universe of 4.27 million Regular Ordinary policies in the
Metropolitan Life Insurance Company.

For example, if a sample of 80,000 policies were drawn time after time,
and if the reserve were estimated from each such sample, the error
would not exceed 0.5 per cent or $10 million in more than 5 per cent of
such estimates. It might be said also that the error would not exceed
0.25 per cent or $5 million in more than one-third of such estimates.
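
The paper does not state how the entries of Table IV were computed. The following sketch shows one conventional calculation that lands in the neighborhood of the tabulated values: scale the stratified standard deviation of $47.17 million (from 1,015 policies) to the required size, assuming a normal error distribution, and then apply the usual finite-universe adjustment. The function name `required_sample` and the exact form of the adjustment are assumptions of this sketch, not the author's stated method.

```python
from statistics import NormalDist

UNIVERSE = 4_270_000                 # Regular Ordinary policies in force
# Assumption: variance per policy implied by the stratified experiment,
# i.e. (47.17)^2 from a sample of 1,015, in ($ millions)^2.
UNIT_VARIANCE = 47.17 ** 2 * 1015


def required_sample(error_millions, risk):
    """Approximate sample size so that |error| exceeds error_millions
    with probability `risk`, allowing for the finite universe."""
    z = NormalDist().inv_cdf(1.0 - risk / 2.0)
    n0 = z * z * UNIT_VARIANCE / error_millions ** 2   # infinite universe
    return n0 / (1.0 + n0 / UNIVERSE)                  # finite-universe adjustment


for error in (2, 5, 10, 20, 40, 80):
    sizes = [required_sample(error, risk) for risk in (0.5, 0.1, 0.05, 0.01, 0.001)]
    print(error, [round(n, -2) for n in sizes])
```

The finite-universe adjustment matters only in the first rows, where the required sample becomes a sizable fraction of the 4.27 million policies.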


CURRENT APPLICATIONS 


The exploratory work reported above gives an indication of the pos- 
sibilities of sampling procedures in reserve valuations. The question 
naturally arises as to what extent these techniques are actually in use. 

As a matter of fact, sampling has been used very little, if at all, for
any major reserve valuation, but it has been used in a number of instances
for smaller items which would be very troublesome to obtain
from a complete inventory. One large item of this nature is the deferred
premium asset for Metropolitan Regular Ordinary business. This asset 
arises from the fact that a very considerable proportion of premiums 
are paid semi-annually, quarterly, or monthly. Since the reserve lia- 
bility is computed for simplicity on the assumption that all premiums 
are paid annually, it is necessary to calculate an adjusting item for 
the balance sheet. 

The item of deferred premiums depends entirely on the distribution 
of premiums according to the month of issue date and according to the 
mode of payment (annual, semi-annual, quarterly or monthly). Ob- 
viously this distribution does not change very rapidly with time, and 
the results from a single sample can be carried forward for a few years. 
The problem is to obtain from a sample the estimated average deferred 
premium at the end of the calendar year per dollar of gross annual 
premium in force, together with the variance of the estimate. A 2 per
cent random sample amounting to 91,011 cards was drawn for the most
recent estimate in 1946. For this sample, the average deferred premium 
per dollar of gross annual premium in force was $0.1964, with the stand- 
ard deviation indicating an error range of about 1 per cent of the 
average at the two sigma level. Thus, in an item of $50 million, there is 
a risk that about once in 20 years, on the average, the error in the esti- 
mate will be more than $0.5 million. Although the valuation of this one 
item involves a very substantial amount, it is very small in relation to 
the reserves for the total business of the Company. 

Another end-of-year adjustment item, “Due and Unpaid Premiums,”
arises from premiums not paid on the due date because policies are not 
transferred to the non-forfeiture accounts until sometime after expira- 
tion of the “grace” period. This item is much less stable than the de- 
ferred item because it depends considerably on current economic psy- 
chology in addition to the proportion of gross annual premium due in 
November and December. The average premium due and unpaid at 
the end of 1946 amounted to $0.0496 per dollar of gross annual premium 
in force, giving a total amount of about $12 million. Because this item 
may change rapidly with time, and is likely to be influenced by Christ- 
mas buying, a new estimate is required each year at the end of Decem- 
ber. The amount of premiums due and unpaid at the end of 1947 was 
estimated from a sample of 100 district averages selected from the 
finite universe of 787 district averages in the total Company. Sampling 
among these averages rather than among individual policies is much 
less efficient statistically, but it has been adopted because of certain 
filing advantages. For the same accuracy, about ten times as many 
cards are required to obtain the sample district averages as would be 
used in a random sample of individual policies. 

The Metropolitan uses a sampling method for the reserve of nearly 
$13 million on Additional Insurance purchased by the application of 
dividends. A sample is taken of all the policies with issue date anniver- 
saries in two different months of the year, giving about 35,000 policies 
with Additional Insurance. For each of nine subdivisions, the amount 
of Additional Insurance and the amount of reserve on this insurance is 
obtained from the sample. Then the average reserve per dollar of in- 
surance is obtained. The total reserve in the nine subdivisions is esti- 
mated by multiplying corresponding averages by the total amount of 
Additional Insurance for each subdivision. The result from one sample 
is carried forward from one year to the next by an accumulation proc- 
ess which reflects the new dividends applied as single premiums to 
purchase Additional Insurance, the interest required on the reserve, and 
adjustments for terminations. The sampling process is repeated every
three to five years as a check and correction on the accumulation process.

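
The estimating step of this procedure (an average reserve per dollar from the sample, applied to the known total of Additional Insurance in each subdivision) can be sketched as follows. The three subdivisions and every figure in the sketch are hypothetical and stand in for the nine actual subdivisions, whose data are not given in the paper; the accumulation process between samples is not modeled.

```python
# Hypothetical figures in $ thousands for three subdivisions:
# (sample insurance, sample reserve, total insurance in the subdivision)
subdivisions = {
    "subdivision 1": (1_200.0, 480.0, 9_000.0),
    "subdivision 2": (800.0, 440.0, 6_500.0),
    "subdivision 3": (500.0, 350.0, 4_000.0),
}


def estimated_reserve(sample_insurance, sample_reserve, total_insurance):
    """Average reserve per dollar of insurance in the sample, applied to
    the known total insurance of the subdivision."""
    return sample_reserve / sample_insurance * total_insurance


by_subdivision = {name: estimated_reserve(*row) for name, row in subdivisions.items()}
total_reserve = sum(by_subdivision.values())
print(by_subdivision, total_reserve)
```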
GENERAL CONSIDERATIONS 

A question will arise as to why sampling methods have not been used 
more widely for valuation purposes, and to what extent such methods 
may be used in the future. 

The answer hinges on the degree of accuracy demanded in the result. 
With respect to reserve valuation the experimental results presented 
indicate that a sample of very reasonable size, 21,000 policies for ex- 
ample, would give substantial assurance that the estimated result 
will be accurate to 1 per cent, but a 1 per cent error in the reserve of 
$2 billion on the Regular Ordinary business becomes a $20 million 
error. Neither Company management nor State Insurance Depart- 
ments would permit errors of this size. Although an error of $20 mil- 
lion is clearly larger than could be tolerated in practice, the amount 
of error might be reduced within certain limits by increasing the size 
of the sample. In any sampling plan it is necessary to determine the 
tolerance limits before proceeding to determine the size of the sample. 

The reserve is based on very broad assumptions even if it is com- 
puted arithmetically with complete correctness, the assumptions being 
in regard to the probabilities of death and in regard to the rate of 
interest. A very moderate change in one of these assumptions may make 
a very substantial absolute change in the total reserve even though the 
percentage change may be relatively small. For example, if it is decided 
to assume that the Company will earn an interest rate of 0.25 per cent 
less in the future than previously had been assumed, this change in 
assumption may well add some $60 million to a reserve of $2 billion. 
Considering the approximate nature of the basic assumptions under- 
lying the reserve calculation, a statistician might well inquire whether 
the degree of arithmetical accuracy now demanded is really necessary. 
There are other important practical considerations, however, which 
must be taken into account. Let us inquire into the source of the re- 
quirement and into the extent to which such accuracy is really needed. 

Section 205 of the current New York Insurance Law provides in part, 
“The Superintendent shall annually value, or cause to be valued, the 
reserve liabilities (hereinafter called reserve) for all outstanding life 
insurance policies and annuity and pure endowment contracts of every 
life insurance company doing business in this state, . . . In calculating 
such reserves, he may use group methods and approximate averages for 
fractions of a year or otherwise.” This last sentence may be interpreted 
to permit the use of sampling techniques. In practice, however, the 
New York Department of Insurance is reluctant to approve approximate
methods of valuation unless it can be demonstrated that the error
is small. Moreover, there is a tendency to consider the error as an ab- 
solute amount rather than as a percentage. 

After the Company makes the valuation required by law, it for- 
wards its working papers to the State Insurance Department for further 
checking and review. The formal transmittal to the New York Insur- 
ance Department is accompanied by an affidavit signed by the Presi- 
dent, the Vice-President and Chief Actuary, and the Assistant Actu- 
aries in immediate charge of the work, and reads in part: 

“That the valuation schedules, summaries and other valuation 
data .. . have been submitted to the State of New York Insurance 
Department as the sole basis for calculating said Company’s valua- 
tion results and issuing its Certificate of Valuation; 

“That such schedules have been properly and correctly prepared 
from the policy and annuity valuation records and that such 
records accurately and completely cover all of the policies and 
annuities of the said Company outstanding and paid-for on the 31st 
day of December, 19... ; 

“That such schedules contain, (1) complete and correct group- 
ings according to age, issue, duration, kind and amount of all such 
policies and annuities, with all the data necessary for making a 
mean, grouped valuation of the same upon the net premium basis 
as required by law, and (2) full and complete information necessary 
for calculating all special net reserve liability items, including 
reserves for disability and accidental death benefits; 

“That the data for the said mean, grouped valuation produces 
results equal to those required by the terms of the policies and 
annuities and the law applicable thereto: 

“That the above named Actuary has charge of the Company’s 
policy and annuity valuation records and of the valuation sched- 
ules and summaries, and has supervised the preparation of the 
schedules and of the Analysis of Valuation hereto attached and 
bearing his signature, and that the checks placed on the same 
insure their accuracy; and 

“That the total amount of reserves on said policies and annui- 
ties is correct as reported on such schedules, summaries and other 
valuation data and as set out in the said Analysis of Valuation, 
according to their best information, knowledge and belief.” 

It is very difficult to base a complete valuation on sampling methods
and still submit such a specific affidavit even if the sample were increased
to an impractical size sufficient to eliminate almost all uncertainty.

Aside from the legal requirements, Company management could not
tolerate an error of estimation which would become significant when 
carried into the surplus, and which would become more significant in 
the surplus earnings of a single year. An error of $1 million in the re- 
serve, in opposite directions at the beginning and at the end of the 
year, would result in an error of $2 million in the surplus earnings of the 
single year. The amount of annual surplus earnings, the starting point 
for the distribution of dividends to policyholders, is about $80 million 
for the block of business on which the experimental results of this dis- 
cussion were based. It is clear that the amount of acceptable error due 
to sampling fluctuations is limited severely by the necessity for accuracy,
consistency, and continuity in this important item. 


FUTURE APPLICATIONS 


It appears likely that the immediate extension of sampling techniques 
in reserve valuations will be limited to the reserves on related benefits, 
including certain paid-up policies, on which the total reserve is so small 
that an error of the order of 1 per cent does not look too large in rela- 
tion to the surplus earnings in a single year. 

One can only speculate in regard to the future development of sam- 
pling techniques for the more important valuation of the main life in- 
surance reserves, running into billions. Application to independent 
checking and auditing might well be a possible development of sampling 
techniques. State Insurance Departments frequently consider the prob- 
lem of reconciling the active file of renewal cards with the separate book 
record in force classification accumulated over the years from the trans- 
actions and used as the basis for the arithmetical calculation of the 
reserve liability. The cost of a complete inventory would be excessive, 
not to mention the difficulty of “freezing” the very active file of renewal 
cards long enough to carry out the work. The exploratory methods used 
in this discussion, with some extension and with a reasonably large 
sample, could provide an independent check on the total reserve which 
would detect any error of sufficient size to be significant when carried 
into surplus. 

Progress may be expected in developing sampling techniques for the 
main life insurance reserve, perhaps by the indirect approach of sam- 
pling transactions rather than by direct frontal attack. It probably will 
take years of study, however, before sampling methods will become 
practicable and generally acceptable for the main life insurance reserve 
liability in the Annual Statement. 








RECENT DEVELOPMENTS IN GRADUATION AND 
INTERPOLATION* 


THOMAS N. E. GREVILLE
National Office of Vital Statistics, U. S. Public Health Service 


Some developments during the 10-year period 1938-1947 in 
the techniques of graduation, or the smoothing of data, and of 
interpolation, are reviewed and summarized. In preparing this 
summary of developments in graduation and interpolation 
during the past decade, it has been the author’s intention to 
refer to the more important results which have come to his 
attention, and which he considers are likely to be of interest 
to practical statisticians and actuaries who have occasion to 
perform graduations or interpolations of actual data. No at- 
tempt has been made to deal with the more abstract topic of 
interpolatory function theory. Within the limits thus set, 
there has been no intention to slight anyone, and the author 
will appreciate having his attention called to any omissions. 
For derivations, proofs, and more detailed treatment, the ref- 
erences given in the bibliography may be consulted. 


I. GRADUATION 


Characteristic function of a graduation formula. Graduation (or
“smoothing”) has been defined as “the process of securing from an
irregular series of observed values of a... variable a smooth regular
series of values consistent in a general way with the observed series of
values” which “is then taken as a representation of the underlying law
which gave rise to the series of observed values” [17].¹ A traditional
method of graduation is based on the use of weighted moving averages.²
This method may be represented by a formula of the form:

\[ F_n = \sum_{\nu=-\infty}^{\infty} y_\nu L_{n-\nu} \tag{1} \]

where \(y_n\) denotes the sequence of given data to be smoothed, \(F_n\) is the
“smoothed” value of \(y_n\), and the numbers \(L_n\) are the numerical coefficients
which characterize the formula. In practice, of course, both the
given data and the coefficients \(L_n\) are limited to a finite range.
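
As a concrete instance of formula (1), the sketch below applies a finite set of symmetric coefficients to a short series. The function name `smooth` is ours, and the coefficients used are the familiar five-term least-squares cubic set (-3, 12, 17, 12, -3)/35, chosen for illustration rather than taken from the paper. Values too near the ends of the series, where the full span is not available, are simply omitted.

```python
def smooth(y, weights):
    """Apply formula (1) with a finite span.

    weights lists L_{-k}, ..., L_0, ..., L_k in that order; the result holds
    F_n for every n at which the whole span lies inside the data.
    """
    k = len(weights) // 2
    smoothed = []
    for n in range(k, len(y) - k):
        # F_n = sum over v of y_v * L_{n-v}; j = n - v runs from -k to k.
        smoothed.append(sum(y[n - j] * weights[j + k] for j in range(-k, k + 1)))
    return smoothed


five_term = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]

cubic = [float(n ** 3) for n in range(10)]
noisy = [c + e for c, e in zip(cubic, [4, -6, 5, -3, 6, -5, 3, -4, 5, -6])]

print(smooth(cubic, five_term))   # a cubic passes through unchanged
print(smooth(noisy, five_term))   # irregularities are damped
```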

* A paper presented at the 107th Annual Meeting of the American Statistical Association, New
York City, December 30, 1947.

¹ Numbers in brackets refer to the bibliography at the end of this article.

² This method is also referred to as graduation by “adjusted averages” or by “linear compounding.”
Prominently associated with its early development are the names of De Forest [28] and Sheppard
[19, 20].

In an elegant study of this method of graduation, Schoenberg [18]
has defined a “characteristic function” for each such smoothing formula,
and has shown that the properties of the formula are uniquely
related to certain properties of this function, which thus uniquely
“characterizes” the smoothing formula; so that, knowing the function,
one can deduce the formula, and vice versa. Specifically the characteristic
function of the given sequence is defined as


\[ T(u) = \sum_{n=-\infty}^{\infty} y_n e^{inu}, \]

where u is an arbitrary parameter. Similarly the characteristic function
of the smoothing formula is defined as

\[ \phi(u) = \sum_{n=-\infty}^{\infty} L_n e^{inu}. \]

By multiplication, it is evident that

\[ T(u)\,\phi(u) = \sum_{n=-\infty}^{\infty} F_n e^{inu}, \]

so that the characteristic function of the “smoothed” sequence is the
product of the characteristic functions of the given data and of the
smoothing formula. If the coefficients \(L_n\) are symmetrical in the sense
that \(L_{-n} = L_n\), the characteristic function of the formula can be
expressed in the form of the Fourier series:

\[ \phi(u) = L_0 + 2 \sum_{n=1}^{\infty} L_n \cos nu. \]

In terms of the characteristic function, Schoenberg proposes a cri- 
terion for stating when a formula of this type is really a “smoothing” 
formula in the sense that its application may be expected actually to 
increase the smoothness of the data. It has been customary to judge 
the smoothness of a graduation by the size of some specified order of 
finite differences of the graduated values. Using the same approach, he 
establishes the relations 


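\[ \sum_{n=-\infty}^{\infty} (\Delta^m y_n)^2 = \frac{1}{2\pi} \int_0^{2\pi} \left(2 \sin \tfrac{1}{2}u\right)^{2m} |T(u)|^2 \, du, \]

\[ \sum_{n=-\infty}^{\infty} (\Delta^m F_n)^2 = \frac{1}{2\pi} \int_0^{2\pi} \left(2 \sin \tfrac{1}{2}u\right)^{2m} |T(u)|^2 \, |\phi(u)|^2 \, du, \]

where m is any nonnegative integer. Observing that the integrands of
these two expressions differ only by the factor \(|\phi(u)|^2\), he defines a
smoothing formula as one for which \(|\phi(u)| \le 1\) for \(0 \le u \le 2\pi\), with the
additional stipulations³

\[ \sum_{n=-\infty}^{\infty} L_n = 1 \quad \text{and} \quad \sum_{n=-\infty}^{\infty} |L_n| < \infty. \]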
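
Schoenberg's criterion is easy to test numerically for a symmetric formula of finite span, using the cosine form of the characteristic function. The sketch below is illustrative only: the function name `phi`, the grid of 1,001 points, and the two sets of coefficients are our choices. The first set is the five-term least-squares cubic formula (-3, 12, 17, 12, -3)/35; the second, (-1, 3, -1), also sums to unity but exaggerates rapid oscillations instead of damping them.

```python
import math


def phi(u, half_weights):
    """Characteristic function of a symmetric formula.

    half_weights lists L_0, L_1, ..., L_k (with L_{-n} = L_n understood),
    so phi(u) = L_0 + 2 * sum of L_n * cos(n u).
    """
    return half_weights[0] + 2.0 * sum(
        half_weights[n] * math.cos(n * u) for n in range(1, len(half_weights))
    )


def is_smoothing(half_weights, points=1001, tol=1e-12):
    """Check |phi(u)| <= 1 on an even grid over [0, 2*pi]."""
    grid = [2.0 * math.pi * i / (points - 1) for i in range(points)]
    return all(abs(phi(u, half_weights)) <= 1.0 + tol for u in grid)


five_term = [17 / 35, 12 / 35, -3 / 35]   # L_0, L_1, L_2
sharpening = [3.0, -1.0]                  # L_0, L_1

print(is_smoothing(five_term), phi(math.pi, five_term))
print(is_smoothing(sharpening), phi(math.pi, sharpening))
```

A grid check of this kind is of course only a numerical indication, not a proof, that the inequality holds for every u.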


Application of orthogonal polynomials to graduation formulas of maximum
smoothness. A smoothing formula is said to be exact for the degree
m if it reproduces exactly the values of any polynomial of degree not
exceeding m. In this connection, Schoenberg shows that a formula of
the type (1) is exact for the degree m if and only if \(\phi(u) - 1\) has at u = 0
a zero of order m+1, or, in other words, if this expression and its first
m derivatives all vanish for u = 0.
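
Since the j-th derivative of \(\phi(u)\) at u = 0 is \(i^j \sum n^j L_n\), the condition just stated amounts to requiring that the coefficients sum to unity and that their first m moments vanish. The following sketch uses this restatement to find the degree for which a finite formula is exact; the function name `exact_degree` and the two example formulas are ours, not the paper's.

```python
def exact_degree(weights, max_degree=10, tol=1e-9):
    """Largest m for which the formula reproduces polynomials of degree m.

    weights lists L_{-k}, ..., L_0, ..., L_k. Returns -1 if the coefficients
    do not even sum to unity (constants are not reproduced).
    """
    k = len(weights) // 2
    if abs(sum(weights) - 1.0) > tol:
        return -1
    degree = 0
    for j in range(1, max_degree + 1):
        moment = sum((n ** j) * weights[n + k] for n in range(-k, k + 1))
        if abs(moment) > tol:
            break
        degree = j
    return degree


simple_average = [1 / 3, 1 / 3, 1 / 3]
five_term = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]

print(exact_degree(simple_average))   # exact for straight lines only
print(exact_degree(five_term))        # exact for cubics
```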

If \(L_{-n} = L_n = 0\) for n > k, where \(L_k\) and \(L_{-k}\) are not zero, the formula is
said to be of span 2k+1. If each of 2k+1 consecutive ungraduated
values is assumed subject to a random error with zero mean and variance
the same for all the 2k+1 values, and the errors are independent,
then the ratio of the mean squared error of the graduated value to that
of the ungraduated value is

\[ \sum_{n=-k}^{k} L_n^2. \]

Therefore particular interest attaches to those formulas which minimize
this ratio for a given span and degree reproduced. These are said to
effect the maximum “reduction of error,” and are sometimes referred
to as formulas of “maximum weight.”⁴ Those formulas for which

\[ \sum_{n=-k-3}^{k} (\Delta^3 L_n)^2 \]

is a minimum have frequently been called formulas of maximum
smoothness. However, it must be admitted that the choice of third
differences for this purpose is somewhat arbitrary, and that an argument
for the use of some other order of differences could be advanced
with equal justification. It has long been known [7, 1, 16] that the
computation involved in performing a graduation by a formula of maximum
error reduction can be considerably simplified by the use of a
set of orthogonal polynomials associated with the name of Chebyshev.





³ Schoenberg limits the definition to formulas with symmetric coefficients, but it appears to be
equally applicable to those with unsymmetric coefficients.

⁴ This is, of course, a different use of the word “weight” from that employed when a “weighted
average” is referred to. For a fuller discussion of the principles underlying these formulas and the
formulas of maximum smoothness next mentioned, see [17, Chapter 4].

The Chebyshev polynomials $T_i(x, n)$ are given by the formula:

$$T_i(x, n) = \Delta^i\left[x_{(i)}\,(x - n)_{(i)}\right],$$

where $x_{(i)}$ denotes the expression $x(x-1)\cdots(x-i+1)/i!$. These
polynomials are orthogonal over the summation range 0 to $n-1$: in other
words, we have:

$$\sum_{x=0}^{n-1} T_i(x, n)\,T_j(x, n) = 0$$

if $i \neq j$, while

$$\sum_{x=0}^{n-1} [T_i(x, n)]^2 = S_i(n) = \frac{n(n^2-1)(n^2-4)\cdots(n^2-i^2)}{(2i+1)(i!)^2}.$$

If we now consider the smoothing formula for obtaining a graduated
value of $y_{a+z}$ which involves the given values $y_a$ to $y_{a+n-1}$, inclusive,
is exact for the degree $r$, and gives maximum reduction of error, it is
well known that the coefficient of $y_{a+x}$ is given by

$$L_{z,x} = \sum_{i=0}^{r} \frac{T_i(x, n)\,T_i(z, n)}{S_i(n)}. \tag{2}$$

This remarkably simple expression for the coefficient results from the
orthogonal property of the polynomials $T_i(x, n)$. As fairly extensive
tables of the values of $T_i(x, n)$ and $S_i(n)$ have been published [3], this
formula makes it easy to compute linear compound coefficients, not
only for the symmetrical “mid-panel” formulas used over the greater
part of the range of given data, but also for unsymmetrical “end-panel”
formulas to be used near the ends of the range. Greville [13] has
recently shown that the coefficients for the formulas of maximum
smoothness also can be expressed in terms of the Chebyshev polynomials.
Using the same notation as in formula (2), with the addition
that $m$ denotes the order of differences taken as a criterion of
smoothness, the coefficient of $y_{a+x}$ is⁵

$$L_{z,x} = (-1)^m (x+m)^{(m)} (x-n)^{(m)} \sum_{i=m}^{m+r} \frac{\Delta^m T_i(x, n+m)\,\Delta^m T_i(z, n+m)}{(m+i)^{(2m)}\,S_i(n+m)},$$

where $x^{(m)}$ denotes $x(x-1)\cdots(x-m+1)$.
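A minimal numerical sketch of formula (2) may help fix the notation. It assumes NumPy is available; the function names are illustrative only and are not taken from the published tables [3]. The polynomials are built directly from their definition as forward differences, and the closed form for $S_i(n)$ is used for the normalization.

```python
import numpy as np
from math import factorial

def binom_poly(x, i):
    """x(x-1)...(x-i+1)/i!, the quantity written x_(i) in the text."""
    out = 1.0
    for j in range(i):
        out *= (x - j)
    return out / factorial(i)

def chebyshev_T(i, n):
    """T_i(x, n) for x = 0, ..., n-1, as the i-th forward difference."""
    xs = range(n + i)                      # i extra points feed the differences
    g = np.array([binom_poly(x, i) * binom_poly(x - n, i) for x in xs])
    return np.diff(g, n=i)

def S(i, n):
    """Closed form for the sum of squares of T_i(x, n) over x = 0, ..., n-1."""
    num = float(n)
    for j in range(1, i + 1):
        num *= n * n - j * j
    return num / ((2 * i + 1) * factorial(i) ** 2)

def smoothing_coefficients(z, n, r):
    """Coefficients of y_a, ..., y_{a+n-1} in the graduated value of y_{a+z}."""
    return sum(chebyshev_T(i, n) * chebyshev_T(i, n)[z] / S(i, n)
               for i in range(r + 1))

# Mid-panel 5-term formula exact for the second degree, scaled by 35
print(smoothing_coefficients(2, 5, 2) * 35)
```

For $n = 5$, $r = 2$ and $z = 2$ the sketch gives the familiar least-squares weights $(-3, 12, 17, 12, -3)/35$; taking $z = 0$ or $z = 4$ instead gives the corresponding unsymmetrical end-panel weights.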
Difference equation graduation. For a number of years, a favorite 








5 This formula represents a slight improvement over the one given in the aforementioned article. 
For derivation of this formula, see the author’s reply to the discussion of [13], Record, American Institute 
of Actuaries, Vol. 37, Part I, No. 75, April 1948.
method of graduation among American actuaries has been the “difference
equation” method developed by Whittaker [27] and Henderson
[14, 15]. This method results from minimizing the expression:

$$\sum_{x=a}^{b} (u_x - y_x)^2 + k \sum_{x=a}^{b-m} (\Delta^m u_x)^2,$$

where $y_x$ denotes the given values, which are assumed to be known
over the range $x=a$ to $x=b$, $u_x$ denotes the graduated values, $m$ is a
specified order of differences, and $k$ is a parameter to be chosen by the
graduator. The first term in the above expression is a measure of fidelity
to the given data, while the second is a measure of smoothness, and
the parameter $k$ enables the graduator to assign whatever relative
weight he desires to these two requirements. It turns out that the
graduated values satisfy the difference equation:

$$u_x + (-1)^m k\,\delta^{2m} u_x = y_x, \tag{3}$$

where $\delta$ denotes a difference taken centrally. This equation holds for
$a \le x \le b$, with the understanding that $\Delta^m u_x = 0$ for $x < a$ and for $x > b-m$.
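With modern computing facilities the minimization can be carried out directly. The following is a minimal sketch, assuming NumPy; the function name is illustrative. It forms the matrix $K$ of $m$th differences and solves the normal equations $(I + kK'K)u = y$, which are the matrix form of the minimization stated above, end conditions included.

```python
import numpy as np

def whittaker_henderson(y, k, m):
    """Minimize sum (u - y)^2 + k * sum (Delta^m u)^2 over u."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    K = np.diff(np.eye(n), n=m, axis=0)    # (n - m) x n matrix of m-th differences
    return np.linalg.solve(np.eye(n) + k * K.T @ K, y)

rough = np.array([1.0, 4.0, 2.0, 7.0, 5.0, 9.0, 6.0, 11.0])
print(whittaker_henderson(rough, k=5.0, m=3))
```

A series whose $m$th differences already vanish is left unchanged, and the total of the graduated values equals the total of the given values, which is a convenient check on any implementation.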
Early methods of solving the difference equation (3) subject to these 
end conditions produce only approximations to the graduated values 
near the beginning of the sequence, or involve successive approxima- 
tion [14, 15], or else involve extensive computation [21]. Weaver [25] 
published in 1943 an abbreviated method of solution which gives the 
exact values directly. His method is based, in essence, on modern 
“short-cut” methods of solving simultaneous linear equations [8]. Let 


$$v_x = c\,y_x + c_{1,x-1}v_{x-1} - c_{2,x-2}v_{x-2} + \cdots + (-1)^{m-1}c_{m,x-m}v_{x-m} \tag{4}$$

$$u_x = c_{0,x}v_x + c_{1,x}u_{x+1} - c_{2,x}u_{x+2} + \cdots + (-1)^{m-1}c_{m,x}u_{x+m}, \tag{5}$$


where $v_x$ denotes a sequence of “partly graduated” values intermediate
between $u_x$ and $y_x$, the coefficient $c$ of $y_x$ is a constant, and it is
understood that those “c” coefficients are zero which would be applied to
“v’s” with subscripts less than $a$ or to “u’s” with subscripts greater
than $b$. Expanding the differences in equation (3) in terms of the $u_x$
values themselves gives a set of simultaneous equations connecting
the “u’s” and “y’s.” Another such set of equations is obtained by
solving equation (5) for $v_x$ and substituting the resulting expression for
all the “v’s” in equation (4). Equating coefficients in the two sets of
equations gives a set of relationships between the “c” coefficients from
which these coefficients can be successively determined. The possibility
of expressing the relationship between the “u’s” and “y’s” in the form
of equations (4) and (5) depends on the fact that the matrix of the
coefficients in the simultaneous equations is symmetrical about the
principal diagonal. Spoerl [23] has previously described the application
of this method to the solution of simultaneous equations derived from
difference equations.
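The idea of a forward pass producing intermediate values followed by a backward pass producing the graduated values can be sketched with an ordinary symmetric factorization $A = LDL'$. This is an illustration of the principle only, assuming NumPy: it is not Weaver's tabular scheme, the function name is hypothetical, and for brevity it does not exploit the narrow band of the matrix.

```python
import numpy as np

def solve_by_factorization(A, y):
    """Solve A u = y for a symmetric positive-definite A using A = L D L'."""
    n = len(y)
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = A[j, j] - np.sum(L[j, :j] ** 2 * d[:j])
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - np.sum(L[i, :j] * L[j, :j] * d[:j])) / d[j]
    # forward pass: intermediate values (the role played by equation (4))
    v = np.zeros(n)
    for x in range(n):
        v[x] = y[x] - L[x, :x] @ v[:x]
    # backward pass: graduated values (the role played by equation (5))
    u = np.zeros(n)
    for x in reversed(range(n)):
        u[x] = v[x] / d[x] - L[x + 1:, x] @ u[x + 1:]
    return u

# Example: the system arising from second-difference graduation with k = 5
n, m, k = 8, 2, 5.0
K = np.diff(np.eye(n), n=m, axis=0)
A = np.eye(n) + k * K.T @ K
y = np.array([1.0, 4.0, 2.0, 7.0, 5.0, 9.0, 6.0, 11.0])
print(solve_by_factorization(A, y))
```

Because the matrix is symmetrical, the same multipliers serve in both passes, which is the property on which the text relies.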

Spoerl [22] has also generalized the difference equation method of 
graduation by considering the so-called “mixed difference case,” in 
which the expression to be minimized is taken as 


$$\sum_{x=a}^{b} (u_x - y_x)^2 + h \sum_{x=a}^{b-2} (\Delta^2 u_x)^2 + k \sum_{x=a}^{b-3} (\Delta^3 u_x)^2,$$


where h and k are both arbitrary parameters. 
II. INTERPOLATION 


Systematization of osculatory interpolation formulas. One of the most 
common practical problems in interpolation is that which is sometimes 
called the problem of subtabulation. This is illustrated by the case in 
which values of the rate of mortality, or some other actuarial function 
depending on age, are given for every fifth or tenth year of age, and it 
is desired to supply the intermediate values for each single year of age. 
The traditional method of interpolation by means of a moving polynomial
arc of degree $k-1$ determined by $k$ consecutive given values
gives rise to troublesome discontinuities when applied to empirical
data. This long ago led actuaries to alter the conditions imposed on the
polynomial arcs by decreasing the number of given points which the
curve is required to pass through, but requiring the continuity of first 
derivatives, and, in some cases, higher derivatives, of the composite 
interpolation curve [29]. Greville [10] gave, in 1944, a systematization 
of the formulas of this type, which enables one to develop rapidly the 
formula, if any, satisfying given predetermined conditions as to span, 
orders of derivatives required to be continuous, degree of the poly- 
nomial arcs used, exactness to a specified order of differences, and 
reproduction of the given values. 

For example, it is shown that a linear compound formula for inter- 
polation in equal intervals by means of a composite interpolation curve 
made up of spliced polynomial arcs and having a continuous first 
derivative can be expressed in the form 


$$
\begin{aligned}
y_{n+x} ={}& \left[\xi^2(1+2x)(A_0 - B_1) + \xi B_1 - \tfrac{1}{2}\xi^2 x\,\delta^2 B_1\right] y_n \\
&+ \left[x^2(1+2\xi)(A_0 - B_1) + x B_1 - \tfrac{1}{2}x^2 \xi\,\delta^2 B_1\right] y_{n+1} \\
&+ x^2\xi^2 P(y, x)
\end{aligned}
\tag{6}
$$

where $0 \le x \le 1$, $\xi$ denotes $1-x$, $\delta$ denotes a difference taken centrally,
$A_0$ and $B_1$ are finite difference operators of the form

$$c_0 + c_1\delta^2 + c_2\delta^4 + \cdots,$$

and $P(y, x)$ is any linear compound of the “y’s” having polynomials in
$x$ as coefficients. Suppose it is desired to obtain a 4-term formula using
third-degree polynomial arcs, correct to second differences and reproducing
the given values. Since the polynomial arcs are to be of the third
degree, we must have $P(y, x)=0$. Reproduction of the given values
means that $A_0=1$; to have a 4-term formula, $B_1$ must be limited to the
first term $c_0$ only; and for correctness to second differences $c_0$ must have
the value unity. Substituting these values in equation (6) gives the
well known Karup-King interpolation formula.
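The substitution can be checked numerically. The sketch below evaluates equation (6) with $A_0 = 1$, $B_1 = 1$ and $P(y, x) = 0$; the function name and the list-index convention (`y[n]` for $y_n$) are illustrative assumptions.

```python
def karup_king(y, n, x):
    """Value at n + x (0 <= x <= 1) from y[n-1], y[n], y[n+1], y[n+2]."""
    xi = 1.0 - x
    d2_n = y[n - 1] - 2 * y[n] + y[n + 1]       # central second difference at n
    d2_n1 = y[n] - 2 * y[n + 1] + y[n + 2]      # central second difference at n + 1
    return (xi * y[n] - 0.5 * xi ** 2 * x * d2_n
            + x * y[n + 1] - 0.5 * x ** 2 * xi * d2_n1)

squares = [float(t * t) for t in range(6)]
print(karup_king(squares, 2, 0.4))   # close to 2.4 ** 2, since the data are of the second degree
```

Setting $x = 0$ or $x = 1$ returns the given values themselves, and data of the second degree are reproduced at every intermediate point, in agreement with the conditions imposed above.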

On the other hand, if it should be desired to forego the requirement
that the given values be reproduced but to require instead that the
polynomial arcs be of the second degree, $A_0$ is of the form $1+k\delta^2$.
In order for the polynomials to be of the second degree we must have
$A_0=(1+\tfrac{1}{4}\delta^2)B_1$. Since $B_1=1$, that means that $A_0=1+\tfrac{1}{4}\delta^2$, which gives

$$y_{n+x} = \left(\xi + \tfrac{1}{4}\xi^2\delta^2\right)y_n + \left(x + \tfrac{1}{4}x^2\delta^2\right)y_{n+1},$$

with $\xi = 1 - x$, which is Jenkins’ second difference smoothing interpolation
formula, a well-known formula which not only interpolates but, at the same
time, smooths the given data.

Characteristic function of an interpolation formula. Generalizing his 
previously mentioned study of graduation, Schoenberg [18] has also 
developed a characteristic function for interpolation formulas ex- 
pressible in linear compound form, so that the interpolatory properties 
of the formula can be inferred from certain properties of the character- 
istic function. Let the interpolation formula be 


$$F(x) = \sum_{\nu=-\infty}^{\infty} y_\nu\,L(x - \nu),$$

where the numbers $y_\nu$ represent the given data, $F(x)$ denotes the
interpolated value, and $L(x)$ is called the “basic function” of the
interpolation formula. The characteristic function⁶ is defined as

$$g(u) = \int_{-\infty}^{\infty} L(x)\,e^{iux}\,dx.$$





6 A function $g(u)$ defined in this manner in terms of another function $L(x)$ is sometimes referred to
as the Fourier transform of $L(x)$.
If, as is usually the case, $L(x)$ is an even function, this may be written
in the form

$$g(u) = \int_{-\infty}^{\infty} L(x)\cos ux\,dx.$$

Schoenberg then shows that the interpolation formula is correct to
$k$th differences if $g(u)-1$ has a zero of order $k+1$ at $u=0$ and, at the
same time, $g(u)$ has zeros of order $k+1$ for $u=2\pi n$, where $n$ is any
integer other than zero. He also shows that the formula reproduces the
given values if and only if

$$\sum_{\nu=-\infty}^{\infty} g(u + 2\pi\nu) = 1.$$

Incidentally, he further shows that the basic functions of the well-known
interpolation formulas can be expressed in a remarkably compact
form through the use of Fourier integrals. For example, Jenkins’
well-known fifth-difference smoothing interpolation formula is given
by:

$$L(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\frac{\sin(u/2)}{u/2}\right)^{4}\left(\frac{4}{3} - \frac{1}{3}\cos u\right)\cos ux\,du.$$
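The two criteria can be tried on the simplest case. The sketch below is an illustration chosen for this purpose and not an example from Schoenberg's paper: it takes ordinary linear interpolation, whose basic function is $L(x) = 1 - |x|$ for $|x| \le 1$ and zero elsewhere, and whose characteristic function is $[\sin(u/2)/(u/2)]^2$. NumPy is assumed and the function names are hypothetical.

```python
import numpy as np

def g_linear(u):
    """Characteristic function of linear interpolation: (sin(u/2)/(u/2))^2."""
    u = np.asarray(u, dtype=float)
    return np.sinc(u / (2 * np.pi)) ** 2          # np.sinc(t) = sin(pi t)/(pi t)

def g_numeric(u, steps=20000):
    """Midpoint-rule value of the integral of L(x) cos(ux) over [-1, 1]."""
    dx = 2.0 / steps
    x = -1.0 + dx * (np.arange(steps) + 0.5)
    return float(np.sum((1 - np.abs(x)) * np.cos(u * x)) * dx)

def periodized_sum(u, terms=2000):
    """Truncated sum of g(u + 2 pi nu) over nu = -terms, ..., terms."""
    nu = np.arange(-terms, terms + 1)
    return float(np.sum(g_linear(u + 2 * np.pi * nu)))

print(g_numeric(1.3), float(g_linear(1.3)))       # the two values should agree
print(float(g_linear(2 * np.pi)))                 # vanishes at u = 2 pi
print(periodized_sum(0.7))                        # close to 1
```

Here $g(u)$ has zeros of the second order at the nonzero multiples of $2\pi$ and $g(u) - 1$ has a zero of the second order at the origin, corresponding to correctness to first differences, while the periodized sum equals unity, corresponding to reproduction of the given values.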


Minimized difference interpolation. Beers [4] has suggested that, as a 
practical matter, in most actuarial and statistical applications the in- 
terpolator usually is interested in interpolated values only for certain 
discrete, equally spaced arguments. For example, in the case of actu- 
arial functions depending on age, it is usually only the values for in- 
tegral ages that are of importance. Accordingly, he argues that the 
consideration of continuous curves with continuous derivatives is ir- 
relevant and unnecessary, and proposes the development of tables of in- 
terpolation coefficients in linear compound form designed to minimize, 
on the average, the sum of the squares of a specified order of finite 
differences of the interpolated values, subject to stated requirements 
as to span, correctness to a specified order of differences, and reproduc- 
tion of given values. Thus, in his method the interpolated points are 
not thought of as lying on any particular polynomial or other curve, 
and no mention is made of derivatives or their continuity. 

For example, in the case of a 6-term formula correct to fourth differ- 
ences and reproducing the given values, if it is assumed that any two 
consecutive fifth differences of the given data are independently dis- 
tributed random variables with zero mean and equal variances, the 
interpolation coefficients for subtabulation in fifths which minimize, on 








436 AMERICAN STATISTICAL ASSOCIATION JOURNAL SEPTEMBER 1948 






































TABLE I

INTERPOLATION COEFFICIENTS FOR BEERS’ MINIMIZED FIFTH-DIFFERENCE
FORMULA FOR SUBDIVIDING THE INTERPOLATION INTERVAL INTO FIFTHS


PART A—TO BE USED FOR THE FIRST TWO INTERVALS

Coefficients                              To Obtain:
of            y_.2     y_.4     y_.6     y_.8    y_1.2    y_1.4    y_1.6    y_1.8
y_0         +.6667   +.4072   +.2148   +.0819   -.0404   -.0497   -.0389   -.0191
y_1         +.4969   +.8344  +1.0204  +1.0689   +.8404   +.6229   +.3849   +.1659
y_2         -.1426   -.2336   -.2456   -.1666   +.2344   +.5014   +.7534   +.9354
y_3         -.1006   -.0976   -.0536   -.0126   -.0216   -.0646   -.1006   -.0906
y_4         +.1079   +.1224   +.0884   +.0399   -.0196   -.0181   -.0041   +.0069
y_5         -.0283   -.0328   -.0244   -.0115   +.0068   +.0081   +.0053   +.0015


PART B—TO BE USED EXCEPT FOR THE FIRST TWO
AND LAST TWO INTERVALS

Coefficients            To Obtain:
of           y_n+.2   y_n+.4   y_n+.6   y_n+.8
y_n-2        +.0117   +.0137   +.0087   +.0027
y_n-1        -.0921   -.1101   -.0771   -.0311
y_n          +.9234   +.7194   +.4454   +.1854
y_n+1        +.1854   +.4454   +.7194   +.9234
y_n+2        -.0311   -.0771   -.1101   -.0921
y_n+3        +.0027   +.0087   +.0137   +.0117


PART C—TO BE USED FOR THE LAST TWO INTERVALS

Coefficients                              To Obtain:
of          y_e-1.8  y_e-1.6  y_e-1.4  y_e-1.2   y_e-.8   y_e-.6   y_e-.4   y_e-.2
y_e-5        +.0015   +.0053   +.0081   +.0068   -.0115   -.0244   -.0328   -.0283
y_e-4        +.0069   -.0041   -.0181   -.0196   +.0399   +.0884   +.1224   +.1079
y_e-3        -.0906   -.1006   -.0646   -.0216   -.0126   -.0536   -.0976   -.1006
y_e-2        +.9354   +.7534   +.5014   +.2344   -.1666   -.2456   -.2336   -.1426
y_e-1        +.1659   +.3849   +.6229   +.8404  +1.0689  +1.0204   +.8344   +.4969
y_e          -.0191   -.0389   -.0497   -.0404   +.0819   +.2148   +.4072   +.6667





the average, the sum of the squares of the fifth differences of the
interpolated values are those given in Table I. While, on theoretical grounds,
a fairly high negative correlation between successive fifth differences
would be expected, it will commonly be found in practice that the
actual behavior of the fifth differences of empirical data approximates
closely enough to the assumptions stated so that Beers’ formula yields
plausible results with a highly satisfactory degree of smoothness.

Tables of interpolation coefficients deduced on the same or similar
principles have been published by Beers [4, 5, 6], Greville [11], and
Wells [26].
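The use of such a table is purely mechanical, as the following sketch indicates. It assumes NumPy and takes the mid-panel coefficients as read from Part B of Table I, with signs checked so that each column sums to unity and the rows are mirror images of one another; the names `BEERS_MID` and `beers_mid_panel` are illustrative.

```python
import numpy as np

# Rows: coefficients of y[n-2], ..., y[n+3].
# Columns: values to obtain at n + .2, n + .4, n + .6, n + .8.
BEERS_MID = np.array([
    [ .0117,  .0137,  .0087,  .0027],
    [-.0921, -.1101, -.0771, -.0311],
    [ .9234,  .7194,  .4454,  .1854],
    [ .1854,  .4454,  .7194,  .9234],
    [-.0311, -.0771, -.1101, -.0921],
    [ .0027,  .0087,  .0137,  .0117],
])

def beers_mid_panel(y, n):
    """Four interpolated values between y[n] and y[n+1] (interior intervals only)."""
    window = np.asarray(y[n - 2:n + 4], dtype=float)
    return window @ BEERS_MID

ages = [10.0 + 2.0 * t for t in range(8)]      # illustrative data at every fifth age
print(beers_mid_panel(ages, 3))
```

For the first two and last two intervals the same product would be formed with the coefficients of Parts A and C applied to the first six and last six given values respectively.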

Relationship of interpolation to summation graduation. Vaughan [24] 
has shown that there is an intimate connection between linear com- 
pound subtabulation and the “summation” formulas of graduation 
which have received much attention from English actuaries. The latter 
are a special type of linear compound graduation formula, in which the 
relationship between the graduated and ungraduated value is of the
form

$$u_x = \frac{[p][q][r]\cdots}{pqr\cdots}\,f(\delta)\,y_x,$$

where $p, q, r, \cdots$ are positive odd integers, $[p]$ is an operator denoting


the summation of p consecutive given values symmetrically situated 
about the value to which the operation is applied, and $f(\delta)$ is an
“operand” of the form $1+c_1\delta^2+c_2\delta^4+\cdots$, which is usually chosen
so as to make the graduation formula correct to a specified order of 
differences. This type of formula was of greater interest and impor- 
tance before computing machines came into general use, as the use of 
summations in the formulas then resulted in a substantial short cut 
in computation. 

Let a subtabulation formula for dividing the intervals between given
data into $k$ parts be

$$u_x = \sum_{\nu} L_{(x-\nu)/k}\,y_\nu.$$

Then, there corresponds uniquely to this interpolation formula the
graduation formula

$$U_x = \frac{1}{k}\sum_{\nu} L_{(x-\nu)/k}\,y_\nu.$$

Vaughan shows that the latter formula can be expressed in the form

$$U_x = \frac{[k]^{n+1}}{k^{n+1}}\,f(\delta)\,y_x,$$

where $n$ is the order of differences to which the interpolation formula is
correct. Moreover, the terms of $f(\delta)$ as far as $\delta^n$ must be such that the
entire graduation formula is correct to $n$th differences. As the number
of possible terms in $f(\delta)$ is limited by the span of the interpolation
formula, this leaves, in general, only a small number of undetermined 
coefficients. This principle therefore provides an alternative method of 
derivation of interpolation coefficients satisfying such criteria as Beers 
has proposed. 
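The connection can be seen in the simplest case. The sketch below is an illustration under stated assumptions rather than an example from Vaughan's paper: it takes ordinary linear interpolation with $k = 5$, forms the corresponding graduation weights by sampling the basic function at intervals of $1/k$ and dividing by $k$, and compares them with two successive $k$-term summations each divided by $k$ (linear interpolation being correct to first differences, with operand unity). NumPy is assumed.

```python
import numpy as np

k = 5
# Basic function of linear interpolation, L(t) = max(0, 1 - |t|),
# sampled at t = j/k and divided by k to give graduation weights.
j = np.arange(-(k - 1), k)
weights = np.maximum(0.0, 1.0 - np.abs(j) / k) / k

# The operator [k]/k is a k-term moving average; applying it twice
# corresponds to convolving its weights with themselves.
moving_average = np.ones(k) / k
summation_form = np.convolve(moving_average, moving_average)

print(weights * k * k)
print(summation_form * k * k)
```

Both lines print the triangular weights 1, 2, 3, 4, 5, 4, 3, 2, 1, so that in this case the graduation formula associated with the subtabulation formula is simply a double summation in fives.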
Interpolation by applying a least square criterion over the entire range.
Vaughan [24] and Greville [12] have considered independently a some- 
what different approach to the subtabulation problem. In this ap- 
proach, instead of expressing each interpolated value in terms of a few 
neighboring values, the object is “to ascertain definitely the inter- 
polated terms which will have over the whole series the smallest possi- 
ble sum of squared differences of any given order” [24]. It is shown 
that in this case the given and interpolated values are connected by the 
relation: 


$$y_{\nu k - x} = \frac{E^{-x}F(E)}{f(E^k)}\,y_{\nu k}, \tag{7}$$

where $k$ is the interval of subtabulation, $E$ denotes the displacement
operator defined by $Ey_x=y_{x+1}$,

$$F(E) = \frac{(E^{k/2} - E^{-k/2})^{2m}}{(E^{1/2} - E^{-1/2})^{2m}} \tag{8}$$

($m$ being the order of differences to be minimized), and $f(E^k)$ is an
expression consisting of only those terms in the expanded quotient of
formula (8) in which the exponent of $E$ is a multiple of $k$. The solution
of the difference equation (7) is facilitated by introducing an intermediate
sequence $v_x$ which vanishes except when $x$ is an integral multiple
of $k$, such that

$$y_{\nu k - x} = E^{-x}F(E)\,v_{\nu k},$$

and

$$f(E^k)\,v_{\nu k} = y_{\nu k}. \tag{9}$$

The difference equation (9) may be solved by a method analogous to
that of Weaver previously described, or by expanding the function
$1/f(t)$ in a power series of the form: $a_0+a_1(t+t^{-1})+a_2(t^2+t^{-2})+\cdots$.
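The criterion itself is easily stated in matrix terms, and with a general least-squares routine the interpolated terms can be found without the operator expansion. The following is a minimal sketch of that direct approach, assuming NumPy; it is not the operator method of [12] or [24], and the function name is illustrative. The given values are held fixed and the remaining terms are chosen to make the sum of squared $m$th differences of the whole series as small as possible.

```python
import numpy as np

def least_squares_subtabulate(y_given, k, m):
    """Insert k - 1 values between consecutive given values so that the sum
    of squared m-th differences of the complete series is a minimum."""
    y_given = np.asarray(y_given, dtype=float)
    n = k * (len(y_given) - 1) + 1
    known = np.arange(0, n, k)                   # positions of the given values
    free = np.setdiff1d(np.arange(n), known)     # positions to be interpolated
    D = np.diff(np.eye(n), n=m, axis=0)          # m-th difference matrix
    # minimize || D[:, free] z + D[:, known] y ||^2 over the free values z
    z, *_ = np.linalg.lstsq(D[:, free], -D[:, known] @ y_given, rcond=None)
    out = np.empty(n)
    out[known] = y_given
    out[free] = z
    return out

given = [0.0, 25.0, 100.0, 225.0, 400.0]         # squares of 0, 5, 10, 15, 20
print(least_squares_subtabulate(given, k=5, m=3))
```

When the given values lie on a polynomial of degree less than $m$, the minimum is zero and the polynomial itself is returned, as in the example; at least $m$ given values are needed for the solution to be unique.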

Interpolation by “quadratic cross-means.” Another type of practical 
interpolation problem is that in which only a single isolated value, or 
only a few such values, are desired. Probably the best method of es- 
timating such values for empirical data is the classical one of fitting a 
polynomial of degree k to k+1 given values. One of the most effective 
methods of carrying out the computation for such interpolation is the 
iterative process given in 1932 by Aitken [2], which has recently been 
improved by Feller [9], who demonstrates Aitken’s basic process in the 
following manner. Let it be required to compute the value $f(\xi)$ of a
polynomial of the $n$th degree, $f(x)$, given $f_k=f(x_k)$ for $k=0, 1, \cdots, n$.
We define

$$f^{(1)}(x) = \frac{1}{x - x_0}\begin{vmatrix} f_0 & x_0 - \xi \\ f(x) & x - \xi \end{vmatrix}$$

and note that it is a polynomial of degree $n-1$, and that $f^{(1)}(\xi)=f(\xi)$.
Hence we are required to find $f^{(1)}(\xi)$ knowing

$$f_k^{(1)} = \frac{1}{x_k - x_0}\begin{vmatrix} f_0 & x_0 - \xi \\ f_k & x_k - \xi \end{vmatrix}$$

for $k=1, 2, \cdots, n$. Thus the problem has been reduced to a similar
problem in which the number of given values is one less, and in like
manner it is further reduced to $n-2$, and so on.

The improvement introduced by Feller arises in interpolating by
“quadratic cross-means.” The latter process, which originally appeared
to work only for an even number of data, can now be equally well
adapted to an odd number. However, it is still necessary that the given
data be symmetrically placed with respect to some point $x=a$. The
essential feature of the method consists in regarding the interpolation
polynomial as a polynomial of lower degree in the square of the
independent variable. This device reduces by about one-half the number
of steps in the iterative process. Denote two symmetrically placed
points by $x_k$ and $x_{-k}$. The point $x_0=a$ is included among the data only
if $n$ is even. Consider

$$\varphi(x) = \frac{1}{2(x - a)}\begin{vmatrix} f(2a-x) & 2a - x - \xi \\ f(x) & x - \xi \end{vmatrix}$$

$$\psi(x) = \frac{1}{2(\xi - a)}\begin{vmatrix} -f(2a-x) & 2a - x - \xi \\ f(x) & x - \xi \end{vmatrix}.$$

Obviously $\varphi(x)$ and $\psi(x)$ are even functions of $x-a$, and hence
polynomials in $t=(x-a)^2$. Moreover, $\varphi(\xi)=\psi(\xi)=f(\xi)$.

If $n=2m-1$, the problem is reduced to finding the value for
$t=(\xi-a)^2$ of the polynomial of $(m-1)$th degree $\varphi(a+\sqrt{t})$, given its values

$$\frac{1}{x_k - x_{-k}}\begin{vmatrix} f_{-k} & x_{-k} - \xi \\ f_k & x_k - \xi \end{vmatrix}$$

for $t=(x_k-a)^2$, $k=1, 2, \cdots, m$.

On the other hand if $n=2m$, we compute the values

$$\frac{1}{2(\xi - a)}\begin{vmatrix} -f_{-k} & x_{-k} - \xi \\ f_k & x_k - \xi \end{vmatrix}$$

for $t=(x_k-a)^2$, $k=0, 1, \cdots, m$, and proceed as before.
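Aitken's basic process, as described above, lends itself to a very short program. The sketch below implements only the linear cross-means (not Feller's quadratic refinement); the function name is illustrative. At each stage the first remaining point is combined with every other point, which lowers the degree by one, until a single value is left.

```python
def aitken(xs, fs, xi):
    """Value at xi of the polynomial through the points (xs[k], fs[k])."""
    xs = list(xs)
    fs = [float(v) for v in fs]
    while len(fs) > 1:
        x0, f0 = xs[0], fs[0]
        # cross-mean of (x0, f0) with each remaining point: degree drops by one
        fs = [(f0 * (xk - xi) - fk * (x0 - xi)) / (xk - x0)
              for xk, fk in zip(xs[1:], fs[1:])]
        xs = xs[1:]
    return fs[0]

def cubic(x):
    return x ** 3 - 2 * x + 1

nodes = [0.0, 1.0, 2.0, 3.0]
print(aitken(nodes, [cubic(x) for x in nodes], 1.5), cubic(1.5))
```

With four given values of a polynomial of the third degree the process returns the exact value, apart from rounding; the given arguments need not be equally spaced, but they must be distinct.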
SAMPLING ERRORS IN MORTALITY AND OTHER
STATISTICS IN LIFE INSURANCE*


Donald D. Cody
The Equitable Life Assurance Society of the United States 


The types, sources, and extent of data entering into the ana- 
lytical statistics in life insurance companies are outlined. The 
statistical errors in mortality investigations arising as a result 
of the limited extent of the death claim experience or arising 
from the use of exposure sampling techniques are discussed. 
Comments are also made about the experience rating problem 
in group insurance and about a possible application of sequen- 
tial analysis to the checking of files. 


TYPES, SOURCES, AND EXTENT OF DATA 


ACTUARIES are concerned with a number of different types of statis-
tical rates and probabilities. The most important of these are the 
mortality rate, the rate of occurrence of permanent disability, the rate 
of termination of such disability by recovery or death, the accidental 
death rate, the rate of claim under accident and health policies, the rate 
of claim under hospitalization and medical reimbursement policies, and 
the rate of termination of policies. The sources of data for making es- 
timates of these rates and probabilities are the company’s files, which, 
in some cases, are designed primarily for furnishing such data, and in 
other cases, are designed primarily for other purposes. A description of 
certain files from which data for life insurance mortality investigations 
are taken in the Equitable Life Assurance Society will serve as an il- 
lustration. 

The basic statistical data regarding death for a life insurance mor- 
tality investigation come almost exclusively from the death claim file, 
which is a file of I.B.M. punched cards prepared from the claim rec- 
ords filed when a policy becomes payable by death. A specimen card is 
shown in Fig. 1. Each calendar year some 16,000 to 18,000 of these cards 
are added to these files. 

The primary source of information concerning the exposure to the 
risk of death in large over-all mortality investigations of standard 
medically examined business comes from another file of I.B.M. punched 
cards which is used primarily for the annual valuation of the company’s 
reserve liability. This file is summarized mechanically to give the num- 
bers and face amounts of insurance on policies in force on December 31 





* A paper presented at the 107th Annual Meeting of the American Statistical Association, New 
York City, December 30, 1947. 
FIGURE 1
DEATH CLAIM DETAIL CARD 


of each calendar year; these are classified according to rating group, 
plan of insurance, calendar year of issue, and age at issue. The data 
from this file must be adjusted to differentiate policies with some spe- 
cial attribute which makes them non-homogeneous with the bulk of 
the policies, e.g., policies issued without a medical examination. There 
are now about 2,000,000 cards in the valuation detail card file of the 
Equitable, and the file is increased by a net addition of 50,000 to 
100,000 cards each year. However, the summary cards (see Fig. 2 for 
a specimen) actually used number less than 100,000. 
FIGURE 2
SUMMARY CARD—INSURANCE IN-FORCE 
It is also necessary to keep a special mortality class file of I.B.M.
punched cards, which are made up from application papers at issue, in 
the case of special policies such as those involving medical impairments 
and occupational hazards (see Fig. 3). There are about 1,250,000 cards 
in these files, representing all policies issued since 1925 to policy holders 
with such impairments or occupations. This may seem like a rather 
large number of cards, but since about 400 medical impairment classes 
and 250 occupational classes have been studied in the last twenty years, 
the number of exposures and the number of deaths in some of the classes 
are very small. 
FIGURE 3
MEDICAL AND OCCUPATIONAL STUDY DETAIL CARD 


These files are so constituted that data on the main medically examined
standard business and on all the special classifications can be combined
with such data from the larger life insurance companies to give
over-all intercompany experience. This pooling of data is made, an- 
alyzed, and published by the Joint Committee on Mortality of the 
Actuarial Society of America and the Association of Life Insurance 
Medical Directors. 


THE ANALYSIS OF MORTALITY AND OTHER STATISTICS 


(a) Sampling errors in death claims. The primary problem facing the 
actuary is the efficient estimation of the true underlying probabilities 
from the statistics available. With respect to the large bulk of standard 
medically examined business, such large numbers are involved that 
sampling errors in the deaths themselves at each age and policy year 
are relatively small. The only serious complication arises from the fact 
that the primary interest is in the financial effect of claims; it is
customary, therefore, to analyse the mortality rates for the most part in
terms of the amounts of insurance exposed and the amounts of claims 
paid. However, it is a relatively simple matter to take into account the 
effects of variation in the amounts of insurance upon the standard 
deviations used to test the reliability of the statistics [1]. 

For instance, the standard deviation in the mortality rate by
amounts, defined as the ratio of the total amount of death claims in
dollars, over the total amount of insurance exposed in dollars at age $x$ is

$$R_x\sqrt{\frac{q_x(1 - q_x)}{l_x}},$$

where

$$R_x = \sqrt{\frac{\sum s^2 l_s}{l_x}} \div \frac{\sum s\,l_s}{l_x}$$

$q_x$ = mortality rate by amounts = $\theta_x / E_x$

$s$ = amount of insurance in dollars on individual life (midpoint
of class interval)

$l_s$ = number of lives exposed with policies of $s$ dollars

$l_x = \sum l_s$

$E_x = \sum s\,l_s$

$\theta_x$ = total actual death claims in dollars.

A typical size for $R_x$ is 2.
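The effect of the distribution of policy sizes can be illustrated numerically. The sketch below rests on an assumption that should be stated plainly: each life is taken to be an independent risk with the same probability of death $q$, so that the standard deviation of the rate by amounts is the standard deviation of the rate by lives multiplied by the ratio of the root-mean-square policy size to the mean policy size. The function name and the figures are illustrative and are not drawn from any company's experience.

```python
import math

def amount_rate_sd(sizes, lives, q):
    """Return (R, standard deviation of the mortality rate by amounts).

    sizes : policy amounts s (class midpoints)
    lives : number of lives exposed at each amount
    q     : underlying mortality rate, assumed common to all lives
    """
    l_total = sum(lives)
    mean_s = sum(s * l for s, l in zip(sizes, lives)) / l_total
    rms_s = math.sqrt(sum(s * s * l for s, l in zip(sizes, lives)) / l_total)
    R = rms_s / mean_s
    return R, R * math.sqrt(q * (1 - q) / l_total)

# Illustrative exposure: many small policies and a few large ones
R, sd = amount_rate_sd([1000, 5000, 25000], [7000, 2500, 500], q=0.005)
print(R, sd)
```

When every policy is for the same amount the ratio is unity and the two standard deviations coincide; the greater the spread of amounts, the larger the ratio.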

The process of graduating a crude series of values of the rate of mor- 
tality has the effect of reducing the variance of the individual rates of 
mortality, but this process is so thoroughly described by T.N.E. 
Greville elsewhere in this issue that mere reference is sufficient here [2]. 

On the other hand, for the investigation of the death rates in special 
mortality classifications, there are important problems involving the 
design of categories in which the data are analysed, the combination of 
data within a number of independent categories into relatively homo- 
geneous groupings large enough to give statistically reliable results, and 
the determination of reliable measures of dispersion: on these problems, 
only slow progress is being made. We are proceeding gradually in the
redesign of the classifications themselves as a result of our growing 
knowledge of the salient attributes of each impairment or occupation. 
The redesign probably can most efficiently be carried out by a working 
group of a doctor to supply the necessary clinical information, an under- 
writer to supply information concerning the acceptance of each risk, 
an actuary to supply information concerning the monetary effects, and 
a statistician to provide a proper appreciation for homogeneous and 
significant categorization in the light of previous statistics. 

(b) Sampling errors due to exposure sampling. So far, the discussion 
has only been concerned with sampling errors in the estimation pro- 
cedures arising from the limited size of the experience from death 
claims. In some instances, because of the cost or time involved in 
bringing up to date a file needed for determining exposures required in 
some particular estimation problem, at the present time consideration 
is being given to taking, say, a 10 per cent sample of all policies issued 
in a particular period by pulling out record cards with policy numbers 
ending in a particular digit. This sample will be studied for the point 
under investigation, and the records characterized by this point will be 
broken down into the necessary number of smaller relatively homoge- 
neous categories, thus giving the exposure to risk of death in each such 
category. These exposures will then be combined with a 100 per cent 
sample of the corresponding death claims from the death claims file; 
the mortality rate is then the ratio of death claims to the exposure 
estimated from the sample, viz., 


$$q = \frac{\theta}{E},$$

where $q$ = mortality rate
$\theta$ = deaths
$E$ = total exposure as estimated from the sample exposure ($sE$)
$100s$ = percentage sample used in determining exposure

In the case of an investigation with large exposures in each category 
with lives used as units (in lieu of amount units), the variance of the 
mortality rate determined by such a procedure is larger than the vari- 
ance in the mortality rate determined from a 100 per cent exposure 
sample, in approximately the ratio

$$1 + \frac{(1 - s)\,q}{s\,(1 - q)}.$$

In the event that $q$ involves a number of ages, and the exposure is taken
over a period of years so that each life may enter into the exposure a
number of times at different ages, this ratio is increased by a serial
correlation effect to the approximate value

$$1 + \frac{T(1 - s)\,\bar{q}}{s\,(1 - \bar{q})},$$

where $\bar{q}$ is a type of weighted average over-all mortality rate and
$T$ is an average period of exposure per life in the category.
The introduction of amounts of claim for the purpose of estimating the 
mortality rate by amounts further complicates the problem. This general
problem may be found discussed in the Transactions of the Actuarial
Society of America [3, 4].
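The order of magnitude of the loss of precision is easily computed. The sketch below assumes the approximate form $1 + T(1-s)\bar{q}/[s(1-\bar{q})]$ for the ratio of variances, which follows from adding the relative variance of the deaths to the relative variance of an exposure estimated from a fraction $s$ of the lives; the function name and the figures are illustrative.

```python
def exposure_sampling_variance_ratio(s, q, T=1.0):
    """Approximate ratio of the variance of the mortality rate with a
    fraction s of the exposure sampled to that with complete exposure.

    q : mortality rate (or weighted average over-all rate)
    T : average period of exposure per life; T = 1 gives the simple case
    """
    if not 0 < s <= 1:
        raise ValueError("sampling fraction s must lie in (0, 1]")
    return 1.0 + T * (1.0 - s) * q / (s * (1.0 - q))

print(exposure_sampling_variance_ratio(0.10, 0.01))         # single year of exposure
print(exposure_sampling_variance_ratio(0.10, 0.01, T=5.0))  # five years per life
```

With a 10 per cent sample and a mortality rate of 1 per cent the variance is increased by roughly one-eleventh in the simple case, so that the sampling of exposures adds comparatively little to the error already present in the deaths.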


THE EXPERIENCE RATING PROBLEM IN GROUP INSURANCE 


A very important type of insurance policy today is the group policy, 
which insures under a single contract the health, retirement, and death 
benefit needs of all employees in certain employment categories of a 
single employer. Under this type of coverage there is a very complicated 
statistical estimation problem involved in determining the claim 
charges (credits in the case of annuities) to be assessed against each 
employer in the annual net cost. In order to establish proper equities 
and to assure relatively stable cost to the employer from year to year, 
it is necessary to base these claim charges partially on each employer’s 
own experience and partially on the over-all experience in the particu- 
lar premium group under which the employer’s contract is classified. 
Naturally, the larger the number of employees and the older the em- 
ployer’s contract, the more weight can be given to the employer’s own 
experience. 

One type of formula used in the case of group life insurance for the 
claim charge against a particular employer is the following: 


$$\frac{A}{A+P}\,kP + \frac{P}{A+P}\,K,$$

where $P$ = total premiums received from the employer during a stipulated
period of time
$K$ = actual claims in the employer’s group during the same period
$k$ = ratio of total death claims to total premiums over the whole
group life insurance experience for the same period
$A$ = constant chosen usually by practical considerations
This function in the case of $K$ and $P$ measured on a “lives” basis has
been determined theoretically by simple probability considerations by
R. Keffer [5]. Mr. Keffer assumes that the ratios, $r$, of the mortality
rates for each group to the mortality rates for all groups combined are
independent of age and are distributed according to a Pearson Type
III curve as follows:

$$y = \frac{(m+1)\,e^{-(m+1)r}\,[(m+1)r]^{m}}{m!},$$

where $m$ is a parameter related to $A$ in the previous paragraph. Since
the probability of death is small, the Poisson Law gives the probability
of experiencing $d$ deaths in a group governed by a ratio $r$ as

$$\frac{e^{-rc}(rc)^{d}}{d!},$$

where $c$ is the expected number of deaths based on the experience of all
groups. Therefore, the probability of the ratio being of value $r$ and of
there being $d$ deaths is

$$Z = \frac{(m+1)^{m+1}\,c^{d}\,(m+d)!}{(m+1+c)^{m+d}\,m!\,d!} \cdot \frac{e^{-(m+1+c)r}\,[(m+1+c)r]^{m+d}}{(m+d)!}.$$

Theoretically $m$ could be determined by fitting this function to the
company experience, although the resulting value might prove impractical.
The mean value of $r$ is then found from this to be

$$\frac{m+1+d}{m+1+c},$$

so that the mean value of the deaths in the group is

$$\frac{m+1}{m+1+c}\,c + \frac{c}{m+1+c}\,d,$$

which may be used as an estimate of the average number of deaths
properly assignable to the group for cost purposes. If we set $m+1=Ak$,
$c=kP$, $d=K$ on the assumption that each life is insured for unity,
then this becomes

$$\frac{A}{A+P}\,kP + \frac{P}{A+P}\,K$$

as set forth above.


There is a need, however, for a reconsideration of the theory of this 
problem, taking into account distributions of exposures and deaths by 
age and amounts of insurance in the whole premium classification and
in the individual group. Functions of a stochastic variable in the 
theory of risk as developed by Professor Cramér and others of the 
Scandinavian actuarial school may prove to be useful tools in obtaining 
insight into this problem [6]. It should also be mentioned that the 
casualty insurance actuaries have given considerable attention to this 
general problem [7, 8, 9]. 


THE CHECKING OF FILES 


An interesting problem exists in connection with the checking of 
such files as the special mortality class files from which basic statistical 
data are drawn. It is desirable to have some trustworthy means of con- 
firming the accuracy of the files without checking the whole back 
against ponderous original records. We are concerned with errors of 
various levels of importance, such as, for instance, the entries of deaths 
and withdrawals. We are currently examining the feasibility of setting 
up acceptance criteria to be applied by a binomial sequential analysis 
procedure [10] to a special mortality class file of some 500,000 I.B.M. 
punched cards. 

We have in mind sorting out sequential samples in groups consisting 
of all cards in a particular impairment group characterized by an im- 
pairment code punched in the cards. Such a sample will involve cards 
into which entries of transactions should have been made over the 
period involved in our investigations. Each card will be compared 
with basic records with respect to all transactions such as, for instance, 
death and withdrawal, and a tabulation will be kept of the number of 
cards on which each type of transaction should have been made and the 
number of errors found in each type of transaction. In the case of 
death transactions, for instance, we feel that we can sustain an error 
of the order of 2.5 per cent of all death transactions without seriously 
biasing our conclusions, bearing in mind that sampling errors exist in 
any case by reason of the limited extent of our data. Our criteria might 
then consist of keeping below 2 per cent the probability of deciding 
the file needs complete rechecking when the actual errors are below 
1 per cent and of keeping below 1 per cent the probability of accepting 
the file as satisfactory when the actual errors are above 5 per cent. The 
average sample number is below 300 for these particular criteria, so 
that probably no more than 600 transactions would have to be inves- 
tigated before the test procedure terminated in the acceptance or re- 
jection of the file. Depending on the percentage of death transactions 
among the cards sampled, which is largely a function of the period of 
years covered, we would hope to investigate no more than 10,000 to
20,000 cards out of the 500,000 cards. If the file does possess the desired 
degree of accuracy, this procedure appears to offer an economical 
means of confirming the fact. Of course, if too many errors are present 
in the file, an extensive effort is indicated to correct it. 
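
The criteria described above can be sketched in code, assuming that the binomial sequential procedure of [10] is Wald's sequential probability ratio test. Under that assumption p0 = 0.01 and p1 = 0.05 are the two error rates, alpha = 0.02 is the risk of rechecking a good file, and beta = 0.01 is the risk of accepting a bad one. The function names are hypothetical, and the average sample numbers are Wald's usual approximations rather than exact values.

```python
import math

def sprt_constants(p0, p1, alpha, beta):
    """Thresholds and per-transaction increments of the log likelihood ratio."""
    upper = math.log((1 - beta) / alpha)       # reaching this: recheck the file
    lower = math.log(beta / (1 - alpha))       # reaching this: accept the file
    step_error = math.log(p1 / p0)             # added for each erroneous transaction
    step_ok = math.log((1 - p1) / (1 - p0))    # added (negative) for each correct one
    return upper, lower, step_error, step_ok

def run_sprt(outcomes, p0=0.01, p1=0.05, alpha=0.02, beta=0.01):
    """outcomes: iterable of booleans, True meaning the transaction is in error."""
    upper, lower, step_error, step_ok = sprt_constants(p0, p1, alpha, beta)
    total, n = 0.0, 0
    for n, is_error in enumerate(outcomes, start=1):
        total += step_error if is_error else step_ok
        if total >= upper:
            return "recheck", n
        if total <= lower:
            return "accept", n
    return "continue", n

def approximate_asn(p0=0.01, p1=0.05, alpha=0.02, beta=0.01):
    """Approximate average sample numbers at p0, at p1, and at their maximum."""
    upper, lower, step_error, step_ok = sprt_constants(p0, p1, alpha, beta)
    at_p0 = (alpha * upper + (1 - alpha) * lower) / (p0 * step_error + (1 - p0) * step_ok)
    at_p1 = ((1 - beta) * upper + beta * lower) / (p1 * step_error + (1 - p1) * step_ok)
    worst = -upper * lower / (step_error * -step_ok)   # error rate between p0 and p1
    return at_p0, at_p1, worst

asn_p0, asn_p1, asn_worst = approximate_asn()
```

Under these assumptions the approximate average sample number stays below 300 at every error rate, which agrees with the figure quoted in the text. A run of transactions with no errors at all leads to acceptance after a little over one hundred transactions.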


THE EDGE-MARKING METHOD OF ANALYZING DATA


L. L. THURSTONE 
The University of Chicago 

About thirty years ago the writer introduced a simple card 
method of analyzing data. The method has been used by many 
students and others. It is especially adapted to problems with 
experimental populations of not more than about four hundred 
cases in which it is desired to analyze the relations of a large 
number of attributes. The number of attributes can be rather 
extensive, up to three hundred if necessary. The writer’s 
5 by 8 cards for edge-marking have been reprinted many times. 
Most of the users of these cards are familiar with only one or 
two applications. The purpose of this paper is to describe 
enough examples to illustrate the versatility of the edge-mark- 
ing method. 


ORIGINALLY the provocation for devising the method was the frequent
necessity for analyzing data on populations of several
hundred subjects in situations where the punched card methods were 
not available or where they were too expensive. This was often the 
case with students who could not afford the expense of punched card 
methods and in which it was often questionable whether the size of the 
problem would justify the punched card procedures. When the number 
of attributes exceeded the number of columns on the punched card, it 
was necessary to resort to double punching with consequent complica- 
tions in handling the analysis. Methods have been available for punch- 
ing holes or slots in the edges of cards. These methods enable one to 
handle a large number of variables and they are often very useful. 
Special equipment is necessary for their use, and it is often not available
to students who work on problems of limited size. The card
method that we have been using increases still further the number of 
attributes that can be recorded on each card because we use pencil 
marks instead of holes along the edges of the card. This makes possible 
the use of both sides of the card for different attributes. The total 
edge length that is available is therefore twice the perimeter of the card. 
An ordinary 5 by 8 card provides 52 inches of edge length along which 
to record data about the individual who is represented by the card. Statistical
problems can be handled by a student alone without special
equipment beyond the printed cards. For many years the edge-marking
method has been known as the “slant-cut” card system, but the slant-
cut on one corner of the card has only incidental reference to the meth- 
od of statistical analysis with these cards. The cut in one corner of the 
card serves the same purpose as the corner-cut on the usual I.B.M. 
punched cards, namely to make conspicuous any card that has been
inadvertently turned over.

In Fig. 1 we have a diagram of the 5 by 8 card. Each card is first
identified with the name of the individual subject who is represented
by the card. Around the edges of this card there are 130 short lines, 
65 on each side of the card. Each line extends to within a short distance 
of the edge of the card, and each line is identified with a number. A 
master list is prepared showing the attribute that is represented by 
each code number. It is often desirable to represent the presence or 
absence of each attribute. This is usually done by black and red pencil 
marks which continue the printed line to the edge of the card. When 
information is lacking about an attribute, the corresponding space is 
left blank. Accentuation of the attribute can be represented by a double 
red line, or a double black line, at the appropriate space on the edge of 
the card. These ideas can be varied indefinitely to suit particular 
problems. We have found that it is not desirable to use many colors. 
We ordinarily limit the use of color to red and green together with 
black. In some problems the several gradations, or class-intervals of a 
variable, can be represented by several colors. In other cases, where 
there is plenty of room on the card and the number of variables is not 
large, several spaces are provided for the different gradations or class- 
intervals of each variable. Again, these decisions are adapted to each 
problem. 

Perhaps the most common application of the edge-marking method is 
in the preparation of four-fold tables for a large number of attributes. 
As an example, suppose that each attribute has been given a code 
number and that the presence of an attribute is denoted with a red 
mark while the absence of the attribute is denoted with a black mark 
at the edge of the card. For this example we shall assume that only the 
code numbers 23 to 34 inclusive have been used in the positions shown 
at the top of the card in Fig. 1. 

When a card has been prepared for each subject in the experimental 
population, let us start with the tabulation of all four-fold tables that 
involve attribute 23. The cards are stacked and then fanned so as to 
expose the edges as shown in Fig. 2. The cards are then separated ac- 
cording to the edge marks in the position 23. All cards with black marks 
at this position are sorted into one pile, and all cards with red marks are 
sorted into another pile. We shall assume here that there are no blank 
entries. Each of the two piles is then fanned so as to expose the edges 
as shown in Fig. 2. The cards that represent negative answers to item 
23 will show a solid black line at position 23. The other stack of cards
to the right will show a solid red line in the same position. In the figure 
we are using cross hatching to represent the red color. The cards with 
the solid red line at 23 represent the individuals with positive answers 
to item 23. 

FIGURE 2


The two piles are separated slightly in stacking them so as to make 
the code numbers visible for both piles as shown. The figure extends 
only to item 34. A data sheet is prepared as shown on the right side of 
the figure. This table is prepared for recording the four cell frequencies 
for each four-fold table. The first column shows the number of black 
marks in each row of the first pile. The second column shows the num- 
ber of red marks in each row of the first pile. The third column shows the 
number of black marks in each row of the second pile, and the fourth 
column shows the number of red marks in each row of the second pile. 
Consider a particular example as shown in the four-fold table at the 
bottom of the figure. This table shows the association between the attributes
23 and 24. The association is evidently negative in that those
who have attribute 23 tend to be lacking in attribute 24. This fact can 
also be seen by inspection of the row of edge marks at the level 24. It is 
immediately apparent that the proportion of black marks is larger for
the second pile than for the first pile at the level 24. Since there are as 
many as twenty marks on a single edge of the card, it is evident that 
one can inspect the relations of attribute 23 with as many as twenty 
other attributes in one position of the cards. Twenty four-fold tables 
can be prepared from this position of the cards. 
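
The tabulation just described can be imitated in a few lines of code. This is a minimal sketch with invented data: each card is represented as a mapping from code number to "red" (attribute present) or "black" (attribute absent), a missing code stands for a blank edge, and the function name is hypothetical.

```python
from collections import Counter

# Six invented cards; only code numbers 23, 24 and 28 are filled in.
cards = [
    {23: "red", 24: "black", 28: "red"},
    {23: "red", 24: "black", 28: "red"},
    {23: "red", 24: "red", 28: "red"},
    {23: "black", 24: "red", 28: "black"},
    {23: "black", 24: "red", 28: "black"},
    {23: "black", 24: "black", 28: "red"},
]

def four_fold_tables(cards, sort_code, other_codes):
    """Sort the cards into two piles on sort_code, then count marks row by row.

    Returns {code: (black_1, red_1, black_2, red_2)}, where pile 1 holds the
    cards marked black at sort_code and pile 2 those marked red, matching the
    four columns of the data sheet.
    """
    pile_black = [card for card in cards if card.get(sort_code) == "black"]
    pile_red = [card for card in cards if card.get(sort_code) == "red"]
    tables = {}
    for code in other_codes:
        first = Counter(card.get(code) for card in pile_black)
        second = Counter(card.get(code) for card in pile_red)
        tables[code] = (first["black"], first["red"], second["black"], second["red"])
    return tables

tables = four_fold_tables(cards, 23, [24, 28])
```

In this invented sample the row for attribute 24 shows more black marks in the second pile than in the first, which is the same kind of negative association that the text reads off the fanned cards.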

When the cards have been inspected, or counted, for all the attri- 
butes on one edge, the two piles can be turned to each of the other 
edges and the process can be continued until all of the four-fold tables 
have been prepared for attribute 23 which determined the separation 
of the cards into two piles. There are eight edges to be thus examined, 
four on each face of the cards. The cards can then be sorted into two 
piles according to each other variable that is to be analyzed. 

Sometimes it is desired only to inspect the cards without counting 
the actual frequencies. After separating the cards into two piles accord- 
ing to presence or absence of a particular attribute such as a criterion 
measure, one may inspect the cards in search of those other attributes
that correlate with the criterion measure. If the proportion of red and 
black marks is about the same in the two piles, one may skip the com- 
parison without bothering to count the actual frequencies. If we do 
that for the example of Fig. 2, we find, for example, that attribute 26 
has some relation to 23 even before counting the red and black marks. 
When we come to 28, we see that this attribute has a striking relation 
with 23 because there are many black marks on the left but only one on 
the right. Similarly, item 34 shows some relation to 23 because of the 
relatively greater density of black marks on the right. This example 
illustrates the use of the edge-marking method for inspectional purposes 
where only some of the frequencies are actually counted. In attaching 
significance to the relations found, it must of course be remembered that 
significance is determined in part by the total number of relations that 
are inspected. 

In the present example we have assumed that there were no blank 
entries on the data to be recorded. If the original data contain blanks 
for some of the items, these can be represented by merely omitting the 
edge mark for the items in question. In some problems, it is of interest 
to know which questions draw blanks in the answers, and sometimes 
the blanks are considered to be significant as evasions in response to 
some questions. The blanks can be inspected as to relative frequency in 
the two groups of subjects that are being compared. When the cards 
are stacked as shown in Fig. 2, the relative frequencies of the blanks in 
the two piles can be inspected and counted if they are significantly 
different. 
We shall list here a number of typical uses of the edge-marking method:

1. The preparation of four-fold tables for any number of pairs of
variables. These can be prepared in a fraction of the time that 
would be required by ordinary methods of tabulation and 
counting. 


2. The preparation of tables of multiple classification. These are
the cases in which the variables are classified into more than two
categories or class-intervals for the calculation of Chi square and 
other indices. 


3. The inspection of data for questionnaires to discover significant
association between variables without counting all of the fre-
quencies. 


4. The item analysis of questionnaires, psychological and educational
tests, application blanks, and other schedules against some
criterion or dependent variable. 


5. The reduction of a large number of attributes or variables to a
limited number of groups or clusters. This procedure is especially
useful when the problem is to determine the dimensionality of a 
domain that is represented by a large number of discrete items. 
The composite scores for the clusters can then be studied by 
factorial methods. 


6. The calculation of composite scores for a group of attributes.
Each individual score may be the number of attributes in a group
of attributes which are positive in that individual. This calcula- 
tion is done with a scoring frame to be described. 


7. The scoring of an individual performance when the subject
classifies or sorts cards. In a factorial study of perception the
writer included several tests of the optical illusions.¹ These were
arranged on cards which the subject was asked to sort to the right 
or to the left according to the apparent magnitudes in the dia- 
grams on the cards. The scoring was greatly simplified by edge- 
marking. The examiner merely picked up the stack of cards 
which the subject had placed in the right-hand compartment 
and turned them over. The number of right responses was 
counted by the edge marks that were on the backs of the cards 
so that they were not visible to the subject. 

When the examiner turned over the cards, he arranged them
so as to expose the edges as shown in Fig. 2. He placed the stack
of cards on a data sheet as shown at the right of the figure. A
frequency distribution was then made by counting the number
of edge marks in each row. This procedure gave immediately the
distribution of magnitudes because each card was edge-marked
on the back side at a position which designated the magnitude
of the right-hand figure in the optical illusion on that card. The
cards were then shuffled for the next subject.

¹ L. L. Thurstone, A Factorial Study of Perception, The University of Chicago Press, Chicago,
1944, tests 27-31, inclusive.

This work of scoring was done in a fraction of the time that 
would have been required by ordinary methods of tabulation. 
The Minnesota Multiphasic test has also been scored in this 
manner with edge marks on the backs of the cards. These are 
made by the examiner to simplify his work, but the edge marks 
could be printed on the cards by the manufacturer. 

8. Bibliographic cards can be edge-marked to very great advantage.
In every field for which a bibliography is assembled, there
are a number of categories or subheadings. Each journal article 
or monograph is represented by a card. Instead of preparing 
duplicate cards for cross reference purposes, one card can be 
edge-marked on the back, or on the front, to show the several 
categories for which the article is relevant. If it is desirable to 
collect all the references in a particular category, one need only 
assemble all of the cards that have an edge mark in a particular 
space. When the compilation has been completed, the cards can 
be re-sorted into the general alphabetical file. This method 
saves the labor of duplicating cards or of writing explicit cross 
references for each card. When bibliographic cards are prepared 
and used by an investigator, it is often a matter of considerable 
importance to reduce the clerical labor to a minimum. A system 
that is cumbersome and which requires much writing is likely to 
be discontinued. 

9. A practical application of the same kind can be made on a mailing
list. If every person on the mailing list is represented by a
separate card, it can be edge-marked to indicate the categories 
to which the name belongs and the kinds of material that should 
be mailed to that person. The fact that certain material has 
actually been mailed to a group of individuals can be indicated
by edge-marking a stack of cards with a single pencil stroke at 
the appropriate space on the cards when they are fanned so as 
to expose all of the edges. The mailing of different types of ma- 
terial can be controlled from one master card file. These methods 
are not necessarily the most appropriate in large offices. 
10. The filing of library cards and the checking of the file could be
facilitated by edge-marking on the cards. The library classification
could be represented by a series of marks on the edges of the
cards so that the marks would be visible when the cards are 
stacked tight in a drawer. Any card that is misplaced would be 
immediately seen because its markings would not be continuous 
with those of adjacent cards. In this case it should not be neces- 
sary to fan the cards because the marks could be printed on the 
edges when the cards are stacked. This could be done by the 
printer when the cards are manufactured. The writer has sug- 
gested this application to librarians but the method evidently 
has not yet been tried. 

11. If it is desired to present a set of cards in a uniform order to all
subjects, the desired order can be represented by edge-marking 
either on the front or back of the card. When each subject has 
finished his categorization of the cards, they can be re-sorted by 
means of the edge-marking into the original order for the next 
subject. The alphabetized order can be represented by edge- 
marking because the edge-marking is perceived more quickly 
than the alphabetical order.

12. When paired comparison data are to be tabulated, the counting of the
proportion of judgments, j > k, in which j is preferred to k, can be
facilitated by edge-marking. This adaptation can be made in several ways
depending on the original form of the tabulations. 

13. A test form or preference schedule can be so arranged that the
subject’s responses are made at the edges of a card, the central 
portion of which is devoted to the text. The subject’s responses 
can then be analyzed by the edge marks which he has made him- 
self. This procedure saves additional time because the edge- 
marking is then already made by each subject. 
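
As one illustration of the uses listed above, the paired comparison tabulation can be sketched as follows. The judgments are invented, the function name is hypothetical, and each judgment is assumed to be recorded as a pair whose first member is the preferred stimulus.

```python
from collections import Counter

def preference_proportions(judgments):
    """Proportion of judgments j > k for every ordered pair that was compared.

    judgments: iterable of (preferred, other) pairs.
    """
    wins = Counter(judgments)
    proportions = {}
    for (j, k), n in wins.items():
        total = n + wins.get((k, j), 0)           # all comparisons of j with k
        proportions[(j, k)] = n / total
        proportions.setdefault((k, j), 1 - n / total)
    return proportions

# Invented judgments: a preferred to b three times out of four, a always preferred to c.
judgments = [("a", "b")] * 3 + [("b", "a")] + [("a", "c")] * 2
proportions = preference_proportions(judgments)
```

The count of wins for each ordered pair corresponds to the count of edge marks in one row of the fanned cards; the division by the number of comparisons is the only further step.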


For some types of statistical problems we have used a rack for hold- 
ing the cards. This application can be illustrated by an example in 
which we used the larger card, namely, 103 by 12 inches. This was the
largest size of card that could be filed in a vertical filing case of standard 
dimensions. In using the large size card it is best to have a rack on which 
to place the cards for the kind of comparison that was described with 
Fig. 2. The rack consists of a board that is mounted at an angle of 
about sixty degrees with the surface of the table. At the lower edge of 
the board is a ledge on which the cards rest. Two or more groups of 
cards can be arranged on this board for inspection of the edge marks. 


When using the rack for counting frequencies, especially on the large 
cards, the work is facilitated by using a long flexible ruler or preferably
a horizontal rule that can be clamped in position for each row when the 
frequencies are being counted or inspected. 

The use of the scoring frame will be described in terms of a recent 
study in the Psychometric Laboratory at The University of Chicago. 
A schedule of 330 questions had been assembled on temperamental 
characteristics. The subject was asked to respond to each question with 
one of three responses, namely, yes, no, or a question mark. The 330 
responses were edge-marked on a single card for each subject. The 
assumption was made that the large collection of questions represented 
a much smaller number of factors, dimensions, or types. The purpose 
of the study was to make a factorial investigation of the dimensionality 
of the domain that was represented by the entire list of questions. For 
this purpose, it was desirable to discover first the main groupings of the 
items and to make the grouping in such a manner that all of the items 
in each group should be as closely related as possible. 

The items were first arranged in tentative groups according to pre- 
liminary interpretations. Then we wanted to assign to each subject a 
composite score for each group of items. For example, one group of 
items was concerned with rate and amount of activity. A composite 
score for such a group of items would be the subject’s self-rating on 
general activity. A very active person would be expected to have a high 
composite score while a calm and deliberate or phlegmatic person would 
be expected to have a lower rating on this characteristic. Such a com- 
posite rating does not imply that the trait will be revealed as a primary 
factor. It might turn out to be a combination of other more general or 
fundamental factors. Before such a composite rating can be accepted 
even for preliminary study, we should have some assurance that the 
several individual items in each group are highly correlated with the 
composite score for the group. It was for this type of analysis that we 
wanted a composite score for each subject in each group. Those items 
which did not show close relation with the composite score were to be 
eliminated from the group and other items would be added by inspec- 
tion of the edge marks. 

The composite scoring frame is represented diagrammatically in 
Fig. 3. For simplicity of explanation, we shall assume that all of the 
items in the group to be analyzed are represented on the upper and 
right-hand edges of the front face of the edge-marking card. The scor- 
ing frame has the four sides A which enclose a rectangular space equal 
to that of the edge-marking card. One of the individual cards is placed 
into this rectangular space. The upper and right-hand edges of this 
card are shown at B in Fig. 3. A master card C is prepared for each
group of items. This card is slightly smaller than the edge-marking 
card. The master card in the figure has been moved in the direction D, 
toward the lower left corner of the frame. When the master card is
moved toward the lower left corner of the frame, the edge marks become
visible along the upper and right-hand edges as shown in the
two edges B. Let the cross hatched marks represent red edge marks on 
the individual card. 

Let us suppose that nine items have been grouped in the belief that 
they are closely related. These nine items are represented by edge 
marks on the master card C. Red and black edge marks have been made
on the master card according to whether positive or negative responses 
to the items represent presence or absence of the postulated general 
trait. When the master card is in the position shown in Fig. 3, one can 
readily count the number of items on which the subject responded in 
the expected manner if he excelled in the group trait that is represented 
by the master card. The composite score for the individual subject is 
the number of items on which he responded in a manner similar to that 
which is marked on the master card. In the example of Fig. 3 we find 
that the individual subject agreed with the master card on seven of the 
nine questions. On the top edge, for example, we find that the subject 
made two positive responses and one negative response that agreed
with the master card. The responses at F and H agree with the master 
card while the response at G disagrees with the master card. The slant- 
cut at E is also made on the master card so as to insure that the master 
card and the individual card are properly matched. 
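
The counting done with the scoring frame amounts to comparing two cards position by position. A minimal sketch follows; the nine code numbers and all of the marks are invented, the function name is hypothetical, and a blank on the subject's card is assumed to count as a disagreement.

```python
def composite_score(subject_card, master_card):
    """Number of keyed items on which the subject's mark matches the master card."""
    return sum(
        1
        for code, keyed_mark in master_card.items()
        if subject_card.get(code) == keyed_mark   # a blank (missing code) never matches
    )

# Invented master card for a group of nine items.
master_card = {1: "red", 2: "red", 3: "black", 4: "red", 5: "black",
               6: "red", 7: "red", 8: "black", 9: "red"}

# Invented individual card that disagrees with the master card at items 2 and 8.
subject_card = {1: "red", 2: "black", 3: "black", 4: "red", 5: "black",
                6: "red", 7: "red", 8: "red", 9: "red"}

score = composite_score(subject_card, master_card)
```

For these invented cards the score is 7, the same count as in the example of Fig. 3, where the subject agreed with the master card on seven of the nine questions.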

When the count for the composite score has been made, the result 
is recorded on the individual card. The arrow K on the master card 
indicates where the composite score for this group of items should be 
recorded on the individual card. The composite score 7 is now recorded 
on the individual card. The same procedure can be followed for com- 
posite scores on other groups of questions or items. The composite 
scores for different groups are, of course, recorded in different positions 
on the individual card. When all of the composite scores have been 
recorded on the edge-marking cards, these can be tabulated in a fre- 
quency distribution. The median composite score is then determined. 
The cards can then be marked with red and black to indicate whether 
the composite score for an individual is above or below the median or 
mean. The cards are then ready for the preparation of four-fold tables 
to show the relation between the individual item responses and the 
composite score for each group. Similar analyses can be made for 
quartiles or for any other groups. Those items that do not agree with 
the composite can be eliminated and other items can be discovered by 
inspection that should be included instead. 
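
The median split and the comparison of one item with the composite can be sketched in the same way. The scores and item responses below are invented, the function names are hypothetical, and a score exactly at the median is assumed to be marked black; the text leaves that choice open.

```python
from statistics import median

def median_split(composite_scores):
    """Mark each subject red if the composite score is above the median, else black."""
    cut = median(composite_scores.values())
    return {subject: ("red" if score > cut else "black")
            for subject, score in composite_scores.items()}

def item_against_composite(item_marks, split):
    """Four-fold table as (black_low, red_low, black_high, red_high)."""
    counts = {(group, mark): 0
              for group in ("black", "red") for mark in ("black", "red")}
    for subject, group in split.items():
        mark = item_marks.get(subject)            # None stands for a blank
        if mark in ("black", "red"):
            counts[(group, mark)] += 1
    return (counts[("black", "black")], counts[("black", "red")],
            counts[("red", "black")], counts[("red", "red")])

# Invented composite scores and responses to a single item.
composite_scores = {"A": 7, "B": 3, "C": 8, "D": 2, "E": 6, "F": 4}
item_marks = {"A": "red", "B": "black", "C": "red",
              "D": "black", "E": "black", "F": "red"}

split = median_split(composite_scores)
table = item_against_composite(item_marks, split)
```

An item whose red marks fall mostly in the high group, as here, agrees with the composite; an item whose marks are divided about equally between the two groups would be a candidate for elimination.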

In the present example, we have assumed that all of the items in a 
group were on the two edges of the card which are simultaneously visi- 
ble. If the items spread over more than the two edges, the master card 
is moved in the direction E. That exposes the edge marks along the left 
edge and the bottom edge of the card. If the group items extend over 
both faces of the card, then the individual card and the master card 
are turned over so that the counting can be continued on the second face 
of the card in the same manner. The master card is then also marked on 
both faces. These procedures are much easier to show with actual 
manipulation than with the awkwardness of verbal description. Item 
analyses of various kinds can be quite easily handled with these edge- 
marking methods. 

The edge-marking cards can be marked directly from the original 
data. When the data are arranged in columns, it is convenient to place 
the edge of the card next to a column of data and to transfer the column 
of records directly to the adjacent edge of the card. In order for this 
procedure to be feasible, the spacing of the columns on the original 
data sheets should be the same as that of the edge-marks on the card. 
One of the writer’s students, Mr. Albert Hunsicker, made a practical
application by using blank cards instead of the printed cards. He 
adopted a spacing for the edge-marking that matched the spacing on 
his data sheets. The writer has used blank cards in the same manner.
The card shown in Fig. 1 was arranged to fit double spaced typewriting.
If the original data sheets have such spacing, then the data can be trans- 
ferred to the edge-marking card directly by placing the edge of the 
printed card next to the column of data. This method of making the 
edge marks saves much time and avoids clerical errors. 

In making the edge marks it is advisable to avoid a soft pencil be- 
cause such pencil marks are likely to rub off to adjacent cards when 
they are handled. We have used red and black India ink to advantage 
on cards which are to be handled for a large number of tabulations. A 
rather fine pen point seems to be adequate for edge-marking. If the 
cards were to be manufactured in large quantities, they might be made 
with a border that is slightly thinner than the rest of the card. The edge 
marks would be made on this slightly thinner border and they would be 
less likely to rub off onto adjacent cards even with extensive handling. 
The difficulty is avoided by using a medium-hard pencil or by using ink. 

The flexibility and economy of the edge-marking method of classify- 
ing and analyzing data justify more general use than it has had in the 
past. The method is especially useful for those problems in which the 
investigator works either alone or with some clerical assistance and 
where tabulating equipment for punched card methods is not readily 
available. For some problems with relatively small populations and 
with many attributes for each individual subject, the writer prefers 
the edge-marking method to the punched card procedures because the 
significant relations can be identified by inspection without actual 
counting and tabulation. The several examples of this paper are in- 
tended to illustrate the versatility of edge-marking which can be 
adapted to a great variety of purposes.


PART II
CHARACTERISTICS OF THE WPB REPORTING STRUCTURE 


An understanding of the scope of the WPB reporting struc- 
ture demands more than a knowledge of its development as 
described in the first article of this series. Further insight into 
its intricacy and breadth is given by an analysis of its quanti- 
tative and qualitative characteristics. 


NUMBER OF WPB QUESTIONNAIRES 


NUMERICAL statements concerning WPB questionnaires must be
viewed with caution. The numbers of requests for data, and the
figures purporting to reveal their character, have serious limitations as 
units of measure. Reporting forms varied in length and complexity 
from a multipage form, such as PD-25A, to a simple registration state- 
ment requesting only a name and an address. Some complicated forms 
were often little used by management and simple questionnaires fre- 
quently had a high administrative value. Such factors should be kept 
in mind in appraising the following analysis. 

From March 1941 to September 1945, a total of 4,083 different forms 
were approved for issuance by the WPB. This figure includes a total of 
2,397 repetitive or recurring forms, and 1,686 “one-time” forms. These 
figures do not take into account the “wild-cat” forms submitted to in- 
dustry without formal approval, those forms continued by other gov- 
ernmental agencies and trade associations for use by the WPB, forms 
submitted for formal approval but disapproved (the disapproval rate 
was about 10 per cent of the forms submitted), or the unknown but 
substantial number of forms that were discussed and then withdrawn 
by the originating unit before formal submission. The total figures 
compare with 309 questionnaires issued by the War Industries Board in 
World War I. 


QUANTITATIVE ANALYSIS OF REPETITIVE FORMS IN USE
IN DECEMBER 1943 


A quantitative analysis of the recurring questionnaires in existence 
at one point of time gives an enlightening cross-section view of the 
WPB reporting structure. The study reveals the character of data used 
by the WPB management since continuous surveys were, for the most 
part, the instruments upon which administrative action was founded. 
The techniques used on forms at one point of time, however, should 
not be considered representative of the entire war period. Certainly 
the discussion in the previous article indicated the hazards in drawing 
such a conclusion. 

The following study of questionnaires in use in December 1943 was 
derived from an Analysis of the Use of WPB Forms prepared by the 
General Statistics Staff of the WPB and published in December 1943.¹
During December 1943 there were in use a total of 707 recurring forms. 
It is useful to compare this total to counts of repetitive reports in use at 
various points of time. The numbers were: September 1942, 609; 
January 1943, 586; May 1944, 903; and August 1945, 442. 

In the following study a number of one-time and irregular reports 
are added to the 707 repetitive questionnaires in use in December 1943 
to make a total of 786 forms analyzed. The 786 forms are classified into 
the following uses: statistical, 508; application, 305; scheduling, 48; 
and other, 95. Included in the “other” category are registrations,
certificates, bills of material, reports of surplus stock, and appeals.
Since one form often had several uses, the total number of uses recorded
for these 786 forms amounted to 956.
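
The relation between the 786 forms and the 956 uses can be checked with a short tally. The sketch below, in Python, relies only on the counts quoted in the paragraph above; the variable names are illustrative assumptions and are not taken from the WPB study.

```python
# Use counts for the 786 forms analyzed (figures quoted in the text above).
uses = {"statistical": 508, "application": 305, "scheduling": 48, "other": 95}
forms_analyzed = 786

total_uses = sum(uses.values())            # a form is counted once per use
extra_uses = total_uses - forms_analyzed   # uses beyond one per form
uses_per_form = total_uses / forms_analyzed

print(total_uses, extra_uses, round(uses_per_form, 2))
```

The excess of uses over forms measures how much double counting is present; it does not show how many individual forms had more than one use, since a single form could carry three or four.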

WPB questionnaires and reporting instruments were predominantly 
of three types: statistical, application, and scheduling. Statistical ques- 
tionnaires were those designed to obtain information for use in policy- 
making and in serving as guides to action. They were not vehicles for 
administrative action. Applications were primarily designed to serve as 
instruments for authorization by the WPB to industry to take certain 
action. Most of these applications required the submission of data 
that could be used for statistical purposes. The WPB learned that 
reporting was facilitated on a quid pro quo basis, that is, when the sub- 
mission of data resulted in authority to industry to do something. 
Scheduling forms were those designed primarily to obtain information
on the dates at which a manufacturer proposed to ship a designated
item to a specifically identified customer. On the basis of such reports
the WPB “froze” or modified schedules. Many questionnaires embodied
all three of these characteristics.

¹ See also Journal of the American Statistical Association, Vol. 39, No. 226, 1944, pp. 144-154;
“Wartime Facts and Peacetime Needs,” by Vergil D. Reed.

The type of data requested on statistical forms is indicative of the 
sort of information which the WPB felt it needed for operations. The 
following summary, derived from an analysis of the 508 forms used for
statistical purposes, shows the number of forms requesting the type of 
information specified: 


Capacity                               70
Production:
    Actual                            204
    Scheduled                         111
Shipments                             271
Consumption                           181
Receipts or accepted deliveries       178
Inventory                             339
Orders:
    Unfilled                          147
    New                               119
    Cancelled                          52
Requirements                          102
End Use Pattern                       128
Other                                 151


The frequency of return of all 786 forms and the 508 statistical forms 
is summarized as follows: 








Frequency of                     Statistical
Return            All Forms         Forms

One-time              79              42
Daily                  1               0
Weekly                 3               2
Semi-monthly           8               6
Monthly              404             351
Bi-monthly             3               0
Quarterly             88              71
Semi-annual            1               0
Annual                 1               1
Irregular            198              45





The proportion of the total maximum number of respondents filing 
each form was important for statistical reports. The extent of coverage 
of the 508 statistical forms was as follows: complete coverage, 383; large 
firms only, 55; sample of large, medium, and small firms, 22; and
coverage unknown, 48.

Analysis of the type of respondents shows some of the broad areas
covered by the WPB’s reporting. Of the 786 forms, 430 were filed by 
manufacturers; 88 by other types of processors, such as converters, 
tanners, refiners, foundries, and mills; and 159 by “consumers” who 
were largely, but not entirely, manufacturers filing forms as consumers 
of raw materials. At the distribution level there were 121 forms filed 
by other classes of respondents such as warehouses, distributors, whole- 
salers, importers and retailers. There were 6 surveys the respondents of 
which were utilities; 15 covered government agencies, and 85 covered 
other groups. (The total of groups is 904 and results from the fact that 
one form often covered several groups.) 

The number of respondents filing WPB questionnaires varied from 
one to hundreds of thousands. Form PD-789 (producers monthly re- 
port of fibrous glass production), for example, was filed by one com- 
pany. Preference-rating certificates, such as PD-1A, and various other 
application forms were filed by thousands of individuals, plants, and 
companies. Contrary to popular impressions the average number of 
respondents to most reporting forms was relatively small. The follow- 
ing summary table shows that the number of respondents was fewer 
than 100 for more than half of all statistical forms. An astonishing num- 
ber of forms was collected from 25 or fewer respondents. Only 54, or a 
little more than 1 per cent of the total, were submitted by more than
1,000 respondents.


TABLE I

Number of Respondents Classified by Reporting Periods

Number of          One
Respondents        Time    Monthly    Quarterly    Other    Total

N.A.                 5        10           4          8       27
10 and less          1        31           4          7       43
11-25                4        46           5          8       63
26-50                2        53           6          8       69
51-100               4        60          13          6       83
101-200              3        48          10          5       66
201-500              6        48          10          3       67
501-5000            14        39          15          9       77
over 5000            3         8           2          0       13
TOTAL               42       343          69         54      508
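
The statement that more than half of the statistical forms had 100 or fewer respondents can be read off Table I by cumulating its row totals. The following Python sketch does this; it uses only the "Total" column of the table, treats the 27 forms with respondents "N.A." as part of the base of 508, and its variable names are assumptions made for illustration.

```python
# Row totals from Table I: number of statistical forms by respondent count.
# "N.A." (27 forms) is kept separate because its respondent count is unknown.
size_classes = [
    ("10 and less", 43),
    ("11-25", 63),
    ("26-50", 69),
    ("51-100", 83),
    ("101-200", 66),
    ("201-500", 67),
    ("501-5000", 77),
    ("over 5000", 13),
]
not_available = 27
total_forms = not_available + sum(n for _, n in size_classes)

running = 0
cumulative = {}
for label, n in size_classes:
    running += n
    cumulative[label] = running / total_forms  # share of all statistical forms
    print(f"{label:>12}  {running:4d}  {cumulative[label]:6.1%}")
```

Keeping the "N.A." forms in the denominator is the conservative choice: if they were excluded, the share of forms with 100 or fewer respondents would be somewhat higher.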




















Most of the questionnaires were related to a WPB order, such as an L
or M order. Of the 786 forms, 178 were not directly related to orders;
the remainder, 608 forms, were related to more than 300 different
orders.


WPB REPORTS FILED BY A MEDIUM-SIZED PLANT


It is informative to study the quantitative characteristics of the 
WPB reporting system by analyzing the reports filed by a single indus- 
trial plant.* For this purpose a medium-sized plant which employed 
approximately 5,500 persons and produced electric motors, various 
types of electrical equipment and miscellaneous war products was 
selected for study. Reporting requirements for various products were 
so diverse, however, that no one plant should be considered represen- 
tative of all plants. 

A total of 57 reporting forms were chosen. Excluded from the study 
are comparatively insignificant and irregular reports, telephonic re- 
quests for data, informal letters requesting information, various au- 
thorizing forms which were not submitted directly by the plant, and, of 
course, reports filed with the Aircraft Scheduling Unit, Board of 
Economic Warfare, War Department, Navy Department, and other 
such procurement agencies, whether the form did or did not carry a 
WPB form number. Inclusion of these items would, of course, result in 
a much greater total number of questionnaires and surveys actually 
submitted by this plant. 

Table II shows the frequency throughout the war period with which 
selected categories of information were requested on the 57 forms. In 
December 1941 this plant was required to submit only 4 forms. These 
were statistical-type reports filed monthly; three covered aluminum 
data and one asked for information about copper. By the first half of 
1943 the number had risen to 37 and thereafter fell off only slightly. 
Throughout the war period inventory data were most frequently re- 
quested while data pertaining to shipments and production were next in 
importance. 

Neither these figures nor others in the tables reveal fully the volume 
and detail of information asked on individual forms. Modifications of 
old questionnaires and new schedules tended to request more penetrat- 
ing information, in greater volume and detail, as production and dis- 
tribution problems became intensified in the period from early 1941 
through 1943. Thereafter there was a general tendency in the other 
direction, but it was not strong until the spring of 1945. 

Table III classifies the type of information reported by the frequency 
of the reporting period. This table demonstrates clearly that when 
considering the war period as a whole most information, particularly 
detailed information, was reported on a monthly basis. 





* The authors wish to acknowledge that Mrs. Elizabeth Joyce undertook, under the supervision
of the authors, the analysis upon which this section is based.


TABLE II

Selected Categories of Information Requested on Fifty-Seven WPB Forms
Filed by One Plant in Selected Periods of Time

Informational               Total            1942   1942   1943   1943   1944   1944   1945
Category                    Forms   1941    Jan.-  July-  Jan.-  July-  Jan.-  July-  Jan.-
                          (Dupl.)            June   Dec.   June   Dec.   June   Dec.   June

Capacity                        7               2      4      4      2      2      1      1
Production
    Actual                     16      3        6      7     11     10      9     10     10
    Scheduled                  10                             7      8      8     10     10
Shipments
    Dollar                      8      1        3      2      5      2      3      3      3
    Unit                       23      4        7     10     17     14     12     12     12
Material Consumption
    Total                      13      1        4      7     11      9      6      6      6
    By products                 4      1        3      3      3      2      2      2      2
    Anticipated                 7      1        1      2      7      4      3      3      3
Receipts or Accepted
    Deliveries                 15      1        5      8     13     10      6      6      6
Inventory                      26      3       10     12     21     19     15     15     15
Orders
    Unfilled                   12      1        1      2      6      6      7      7      7
    New                         6                      2      5      4      3      3      3
    Canceled                    3                      1      3      3      3      2      2
Material Requirements          14      1        5      7     12      8      7      7      7
Labor
    Use                         5               3      2      3      3      3      2      3
    Requirements                1                             1      1      1      1      1
End Use Pattern                13      1        1      6      8      8      5      5      5
Number of Unduplicated
    Forms                      57      4       15     20     37     36     32     31     30























The frequency with which various types of information were reported 
over the war period on types of questionnaires is shown in Table IV. 
This table shows that most information was reported on statistical-type
forms with shipment, production, inventory and material consumption
data predominating.


Table V on page 471 shows that most of the reports were very limited 
in scope. Throughout the war period the plant studied was requested 
to file the great bulk of its reports for individual products. 




















WPB STATISTICAL REPORTING, II 





TABLE III

Selected Categories of Information Requested on Fifty-Seven WPB Forms
Filed by One Plant by Time Frequency of Reporting

Informational             Total Forms                    Frequency
Category                   Duplicated    Monthly  Quarterly   One-Time  Irregular

Capacity                        7            4        --          2          1
Production                     22           13         6          1          2
    Actual                     16           13         2          1         --
    Scheduled                  10            4         4         --          2
Shipments                      27           16         8          3         --
    Dollar                      8            1         6          1         --
    Unit                       23           16         4          3         --
Materials Consumption          18           10         5          1          2
    Total                      13            9         1          1          2
    By products                 4            1         3         --         --
    Anticipated                 7            4         2          1         --
Receipts or Accepted
    Deliveries                 15           11         2          1          1
Inventory                      26           14         7          4          1
Orders                         15           10         4          1         --
    Unfilled                   12            7         4          1         --
    New                         6            6        --         --         --
    Canceled                    3            3        --         --         --
Material Requirements          14            7         4         --          3
Labor                           2           --        --         --          2
    Use                         5            1         1          1          2
    Requirements                1           --        --         --          1
End Use Pattern                13            8         1          1          3
Number of Unduplicated
    Forms                      57           25        10          7         15




















Most of the forms for which the company was held responsible were 
monthly summaries. The frequency with which reports were filed is 
summarized in Table VI on page 471. 


Conclusions should be drawn from these quantitative data only after 
examination of other aspects of the reporting problem. These studies 
become more meaningful when considered in the light of outstanding 
qualitative characteristics of the WPB reporting system.


TABLE IV

Selected Categories of Information Requested on Fifty-Seven WPB Forms
Filed by One Plant by Character of Use

                                                       Statistical
Informational             Total Forms                      and
Category                   Duplicated  Statistical     Application  Application    Other

Capacity                        7            5               0            0           2
Production                     22           14               2            6           0
    Actual                     16           14               2            0           0
    Scheduled                  10            4               0            6           0
Shipments                      28           20               5            2           1
    Dollar                      8            4               2            2           0
    Unit                       23           18               4            0           1
Material Consumption           18            8               8            2           0
    Total                      13            7               6            0           0
    By products                 4            2               2            0           0
    Anticipated                 7            3               2            2           0
Receipts or Accepted
    Deliveries                 15            8               7            0           0
Inventory                      26           15               7            1           3
Orders                         13            8               1            3           1
    Unfilled                   12            8               0            3           1
    New                         6            5               1            0           0
    Canceled                    3            3               0            0           0
Material Requirements          14            2               9            3           0
Labor                           5            3               0            1
    Use                         4            2               0            1           1
    Requirements                1            0               0            1           0
End Use Pattern                13            4               3            4           2
Number of Unduplicated
    Forms                      57           27               9           13           8




















GENERAL QUALITATIVE CHARACTERISTICS OF THE WPB REPORTING 
STRUCTURE 


The WPB reporting framework developed various broad qualitative 
patterns which gave it a distinct character. The reporting structure in 
its formative years, and to a certain extent in its entire lifetime, was 
not flexible enough to meet new fundamental problems as they ap- 
peared. It did not have sufficient substance to permit the detection of 
crises before they arose. Except in connection with isolated product 
control problems the reporting structure followed rather than preceded
the changing character of needed controls. The reporting structure,
therefore, was molded largely by improvisations generated out of the
pressures of events. Its ability to prepare for and, by so doing, to choke
off serious production problems was not one of its sources of strength.


TABLE V

Distribution by Time Periods of Company, Plant, and Product Reports
Filed by One Plant on Fifty-Seven Forms

                          Company    Plant     Product
Time Period               Reports    Reports   Reports

Total (unduplicated)          3         2         52
1941                          0         0          4
1942
    First Half                2         3         12
    Second Half               0         2         18
1943
    First Half                1         2         34
    Second Half               1         2         33
1944
    First Half                1         2         29
    Second Half               1         1         29
1945
    First Half                1         1         29




















TABLE VI

Distribution by Time Periods of Irregular, One-Time, Monthly, and Quarterly
Reports Filed by One Plant on Fifty-Seven WPB Forms

Time Period         Irregular   One-Time   Monthly   Quarterly

Total                   15          7         25         10
1941                     0          0          3          1
1942
    First Half           2          2          7          4
    Second Half          4          2         11          3
1943
    First Half          10          2         22          3
    Second Half         13          2         17          4
1944
    First Half          14          0         12          6
    Second Half         14          0         12          6
1945
    First Half


The reporting system penetrated almost every conceivable aspect of
industrial production activity, from the accumulation of total produc- 
tion data to cumbering incidentals. Moreover, WPB surveys did not 
stop with industry: they also touched farmers, housewives, retailers, 
wholesalers, financial institutions, and foreign Governments. The range 
of information sought was as expansive as the breadth of controls re- 
quired to channel the productive efforts of the industrial machine. 
Data requirements ranged from questions on operations of giant cor- 
porations to the needs of an individual farmer who wanted to buy an 
alloy steel bolt to fix his tractor. There were actually millions of appli- 
cations transmitted to the WPB by individuals and small businessmen 
to purchase items on restricted lists. 

Chronologically, the main evolution of the reporting structure per- 
tained to individual products and materials. Not only did this pattern 
fit the general philosophy of control but it fitted also the organizational 
structure of control agencies. Throughout the war period the basic 
fabric of the reporting structure was patterned after individual product 
reporting. The superimposition of comprehensive surveys upon these 
product reports created many intricate and serious technical problems. 
Except in a few areas, coordination of comprehensive surveys and indi- 
vidual product reports was never fully accomplished. 

The most commonly understood characteristics of the reporting 
structure probably were the overlapping of questionnaires, incon- 
sistency in reporting requirements, and duplication of incoming in- 
formation. In the chaotic period in mid-1942, major overlapping and 
duplication was inherent in such broad and basic reports as PD-275, 
PD-25A, WPB-732, and in individual allocation schedules. Within 
each of the individual allocation systems the same type of duplication 
existed. For molybdenum, for example, a steel producer might report 
molybdenum requirements on his alloy melt schedule, PD-391; his 
consumption and stocks of molybdenum on WPB-106; detailed molyb- 
denum requirements for melting on PD-258; molybdenum consumption 
on PD-259; and present his application and obtain his allocation for 
molybdenum, pursuant to order M-110, on PD-260. Many comparable 
situations arose from administrative expediency which did not admit 
sufficient time to eliminate such conditions through coordination. 

Throughout its existence, although in lessening degree, the reporting 
structure revealed an amazing lack of procedural unification and 
standardization. This extended from questionnaire format through 
coverage, tabulation, and informational usage. The WPB, for example, 
constantly struggled with the problem of standardizing application
forms relating to L and M orders. In mid-1942 the Chemicals Division
devised two standard forms for its orders (PD-600 for consumers of 
chemicals and PD-601 for producers of chemicals) which eventually 
were used to implement 117 orders. This type of experience, however, 
was not common in the WPB and most orders carried their own spe- 
cialized forms. 

Technical methodology developed in the reporting structure gen- 
erally evolved through three stages. The first stage was characterized 
by difficulties in grasping problems which reports could solve, in recog- 
nizing the applicability of old reporting procedures, or in developing 
new procedures to solve them. In this phase, fumbling, experimentation 
and false starts predominated. Out of these experiences a second stage 
emerged in which practical methods were sorted from impractical ones 
and incorporated in recognized reporting standards. The third stage 
was characterized by an application of these methods to the problems 
at hand and the development of modifications by which they were 
made to fit better the needs of management. 

Although most methods developed along these lines it must not be 
inferred that there was a free interchange and free use of methods 
among the WPB’s personnel so that tested techniques used by others 
were readily adopted in all areas where they were applicable. WPB 
personnel groups showed a strong resistance to accepting methods and 
procedures adopted by others. Each new problem was considered to be 
highly specialized and to demand highly specialized methods for its
solution. Although on the surface new problems often appeared to be 
different from others which had arisen, most problems displayed com- 
parable fundamental characteristics. Their solution, in so far as that 
could be accomplished by statistics, thus merited review of techniques 
adopted in other areas. But such review was never sufficiently general 
to result in free interchange and mutual acceptance of common experi- 
ence. 

The WPB reporting system was considered a “step-child” by far too 
many administrators. Its behavior, except by those fostering it, was 
often viewed with suspicion. It was characteristic of WPB personnel 
to disparage the results of all surveys except their own, particularly 
when the data from each bore on a common problem. This attitude was 
probably more indicative of ignorance regarding statistical methods and 
procedures than of unjustified affection for one’s own data accumula- 
tion. Each person sponsoring and using information generally was 
aware of its strengths and weaknesses. But the complexity surrounding 
the collection and use of each set of data was such that generally 








474 AMERICAN STATISTICAL ASSOCIATION JOURNAL SEPTEMBER 1943 


nothing short of close familiarity with all details of the data could pro- 
vide full comprehension of them. 

The reporting structure was subject to rapid changes not only in
complete basic control systems but also in form details. We have already
seen the outlines of swift changes in broad control mechanisms. Even
quicker changes took place with particular forms. Many of the one-time
questionnaires appeared suddenly and without previous notice.
Among the periodic or recurring reports rapid and unannounced data 
changes often took place. These changes were unusually swift and sub- 
stantial early in the war period but even toward its end sudden and 
radical alterations of old forms were frequent. In 1944 central review 
action was taken on a total of 2,841 questionnaires. Of this number 
1,933 were revisions or extensions of existing reports. Some revisions 
were of no consequence; others were so extensive as to constitute a 
completely new questionnaire. 

The reporting structure, as an instrument or tool of industrial 
management control, reflected either directly or indirectly all the in- 
tricate wartime production control problems. It was the impact of 
these problems on the use of reporting for management purposes, and 
the resolution, irresolution, and compromise exercised in solving them, 
that gave the reporting structure its character and direction. An ap- 
praisal, therefore, of the WPB reporting structure in terms of statistical 
reporting ideals cannot be realistic without a recognition of these in- 
fluences. 


PART III 


CONTROLLING THE ISSUANCE OF QUESTIONNAIRES 


Fortunately, in the earliest days of the NDAC, a central clearance 
function was established to approve or disapprove issuance of ques- 
tionnaires. In retrospect, the need was obvious for a centralized review 
of proposed data collections to avoid duplication, to coordinate report- 
ing, to maintain technical standards, and to insure administrative 
efficiency in the use of collected data. Through central control, all 
problems relating to questionnaires were brought to a focus. 

Before a questionnaire could be approved for issuance, it was neces- 
sary to compromise a variety of pressures relating to its issuance. The 
range of WPB data requirements and methods was so broad and com- 
plex, the impact upon industry so difficult to measure, and the rela- 
tionship between administrative responsibility and statistical proce- 
dural standards so intricate, that questionnaire issuance created thorny
problems which by their very nature could not be answered to the 
satisfaction of everyone concerned. On the one hand, for example, were 
the desperate appeals for clearance of requests “without which the 
WPB cannot operate.” On the other hand were complaints from 
industry that “we are producing weapons not statistics.” Compromising 
such conflicts was difficult. 


LEGAL BASIS OF CENTRAL CONTROL OVER FORM ISSUANCE 


The wartime scramble for facts by Government agencies led to a 
substantial volume of Government questionnaires to industry. Wide- 
spread criticism of poorly devised report forms, duplication of data 
requests, unintelligible instructions, and the rapidity with which re- 
quests multiplied, reached a peak early in 1942. Action was not slow in 
coming. Corrective measures were forced on various Federal agencies,
committees were formed to study the problem, and various business
advisory groups were pressed into service. The most visible result of these
actions was the passage in December 1942 of the Federal Reports Act 
of 1942. 

The Federal Reports Act formalized responsibilities of Federal agen- 
cies in their data collection work and provided a basis for the Bureau 
of the Budget to control all data-collecting activities of Government. 
The act specifically directed the Bureau of the Budget: “to investigate 
the needs of the various agencies for information from business enter- 
prises and other agencies; to investigate the methods used in obtaining 
such information; and to coordinate information-collecting services in 
order to reduce the cost of the Government and minimize the burden 
upon business and industry and utilize, so far as practicable, files of 
available information and existing facilities of the established Federal 
departments and independent agencies.” 

The NDAC in large measure had anticipated the Federal Reports 
Act of 1942 and by so doing had prevented a far more chaotic situation 
than that which existed in the early war period. At the first meeting
of the Advisory Commission to the Council of National Defense, held 
on June 12, 1940, it was stated that “experience had already demon- 
strated the necessity for establishing a central office through which all re- 
quests of the Commission for information would be cleared, that a plan 
was being prepared, and that a tentative suggestion as to the organiza- 
tion of such an office would be presented to the Commission not later 
than Saturday, June 15, 1940.” In the summer of 1940, before the 
NDAC began to collect data from its own questionnaires, the Division 
of Statistics of NDAC, jointly with the Bureau of the Budget, studied
the reporting experience of the War Industries Board in World War I.
Analysis of problems faced by the War Industries Board (which did 
not have formal centralized form control) led to the recommendation 
that the NDAC should create a central clearance point for data requests 
originating within the Commission. A control office was created but it 
operated informally and had no power to compel compliance with its 
decisions. 

Results achieved by this office in coordinating data requests and in 
improving the character of questionnaires issued, however, led to more 
formal authority. This was established, after the OPM absorbed 
NDAC, by the issuance of OPM Administrative Order No. 4. 

Administrative Order No. 4 required every office of the OPM to 
obtain the approval of the Division of Statistics before any formal in- 
quiry could be sent to industry. WPB General Administrative Order 
No. 6, issued in February 1942, carried this clearance requirement 
through the new WPB. Criteria set forth in the order to govern the 
review of questionnaires are worthy of note. They were: 

“1. That the information to be requested is needed at the time it 
is to be filed and that the need justifies the effort and expense 
on the part of both industry and government in order to 
obtain it. 

2. That the data are not already being secured by any other 
federal agency, or, if being secured, are not and cannot be 
made available to the War Production Board. 

3. That the request includes clear and specific definitions of the 
required data. 

4. That the required report is adapted as closely as possible to 
the types of records ordinarily maintained by business con- 
cerns or available to them. 

5. That staff and equipment are available to tabulate or other- 
wise process the returns when received.” 
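
The five criteria amount to a checklist applied to each proposed questionnaire. As an illustration only, the Python sketch below records one review as five yes-or-no answers. The class and field names are hypothetical, and the rule that a single failed criterion blocks approval is an assumption of the sketch; the order itself, as quoted, lists the criteria without stating how they were to be weighed.

```python
from dataclasses import dataclass, fields


@dataclass
class ClearanceReview:
    """One proposed questionnaire checked against the five quoted criteria."""
    needed_and_worth_the_cost: bool        # criterion 1
    not_available_elsewhere: bool          # criterion 2
    definitions_clear_and_specific: bool   # criterion 3
    fits_ordinary_business_records: bool   # criterion 4
    processing_capacity_available: bool    # criterion 5

    def failed_criteria(self) -> list:
        """Names of the criteria this proposal does not meet, in order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def approved(self) -> bool:
        # Assumption of this sketch: any failed criterion blocks approval.
        return not self.failed_criteria()


# Example: a request whose definitions of the required data are vague.
review = ClearanceReview(True, True, False, True, True)
print(review.approved(), review.failed_criteria())
```

Listing the failed criteria, rather than returning a bare rejection, mirrors the practice of notifying the initiating unit of specific objections.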


The Office of Survey Standards was established in the Division of 
Statistics to carry out the provisions of this Order. Passage of the Fed- 
eral Reports Act of 1942 necessitated only one revision in WPB pro- 
cedure: each report form in use was given an approved number assigned 
by the Bureau of the Budget. 

There was continued in the OPM, from the NDAC, an Office of 
Procedures which also was responsible for the review of questionnaires. 
The Office of Procedures carried broad responsibility for reviewing all 
administrative procedures relating to all plans, programs, orders, and 
regulations proposed for issuance by the OPM. This Office carried
responsibilities in administrative procedures comparable with those of
the Office of Survey Standards in questionnaire methods and stand- 
ards. As part of its operations the Office of Procedures reviewed ques- 
tionnaires in terms of administrative practicability, relationships with 
control programs, and other procedural matters. In exercising this func- 
tion there was a clear demarcation from functions of the Office of 
Survey Standards. The functions of both, however, shaded into each 
other. 

This overlapping proved to be a source of strength rather than 
weakness. Nevertheless demarcation was required. In April 1943 a 
working agreement was reached to formalize previous informal arrange- 
ments. It was agreed that the Office of Survey Standards would confine 
its review of data requests to “statistical problems, such as: (1) review 
of the kind of data to be secured; (2) the form in which such data was 
to be secured, e.g., pounds, feet, tons, gallons; and (3) the instructions 
for filling in the requested data.” On the other hand, the Office of Pro- 
cedures confined its review to “administrative impact,” that is: (1) 
the administrative necessity for the form, e.g., if the method of control 
provided in the order was questioned, the form implementing such 
control ipso facto was questionable; (2) the possibility of consolidation 
of forms; and (3) instructions covering the number of copies of forms 
which were to be filed, with whom filed, and so on. Broadly, the former 
matter was statistical and the latter procedural. 

This agreement crystallized a procedure which had existed long be- 
fore. Data requests followed a simple path. First, the Office of Survey 
Standards reviewed new questionnaires. If the questionnaires were 
approved they moved to the Office of Procedures for review against 
existing and efficient administrative procedures. Since our interest is 
primarily statistical rather than procedural the remainder, although 
not all, of this section concerns operations of the Office of Survey 
Standards. The concept of “centralized questionnaire control” does, 
however, include operations of both groups. 


CLEARANCE PROCEDURE 


Rudimentary principles of efficient action for clearance of data re- 
quests were so well recognized by those familiar with the problems in- 
volved that no serious question of clearance procedure arose. Varying 
circumstances surrounding the adoption of each new reporting pro- 
posal, however, often resulted in short-cuts to established lines of ac- 
tion. Nonetheless, most inquiries went through much the following type 
of critical review. 

The bulk of data requests (over 90 per cent) originated in the WPB
Industry and Material Divisions. The divisions were staffed for the
most part with businessmen fresh from industry, who were presumed 
to have some knowledge of the data needed for control and of the 
ability of their industry to supply the requested information. These 
businessmen had help from professional statisticians assigned to each
division, who were presumed to have technical skill in collecting and 
using data. The idea that this first screening would result in a mini- 
mum of ill-conceived questionnaires, however, was not supported by 
the facts. 

After these preliminary reviews the data request was submitted to 
the Office of Survey Standards. The transmittal folder contained a 
transmittal letter; a memorandum of justification; copies of the pro- 
posed questionnaire; and other necessary papers such as processing 
directions, printing requisitions, mailing instructions, and so forth. The 
Office of Survey Standards was required to approve or disapprove the 
data request within three days of submission, a time schedule not al- 
ways met, or to notify the initiating unit of objections. Copies of the 
form, memorandum of justification, and other papers were routed to 
the Bureau of the Budget during this three-day period to notify the 
Bureau of the proposal and to allow concurrent review by the two 
Offices. If it became necessary to consult the unit originating the report, 
a representative of the Bureau of the Budget was invited to be present 
to prevent unnecessary repetition of interviews. 

When a data request was approved by the Office of Survey Stand- 
ards, a formal request for approval was made to the Bureau of the 
Budget under the provisions of the Federal Reports Act of 1942. Be- 
cause of the practice of concurrent review, there were probably not 
more than a half dozen cases in which the Bureau of the Budget dis- 
approved a form after formal request for approval by the Office of 
Survey Standards. Decisions to modify or disapprove data requests 
were discussed with the originating unit to reach agreement on issues 
involved. In most instances this practice led to compromises acceptable 
to the parties concerned. But if no compromise agreement could be 
reached, and where the originating unit could not be convinced of the 
need for withdrawal of the request, no alternative was left but to set 
forth formally the reasons for rejection. 

Having obtained approval by the Bureau of the Budget for a ques- 
tionnaire, the Office of Survey Standards affixed the Budget Bureau 
approval number, signed the print requisition, and forwarded the final 
draft of the form and the printing and mailing requisitions to the Office 
of Procedures. That Office, which had charge of coordinating pro- 
cedures used by various divisions, was consulted in advance by the 
Office of Survey Standards if a proposed data request implied a change 
in established procedure. Consequently, obtaining written approval of 
the Office of Procedures was a fairly simple formality. The Office of 
Procedures, after review, routed the documents to the Printing and 
Distribution Control Division, which had jurisdiction over the layout, 
printing, and bulk distribution of all WPB forms. 

This general review procedure altered little during the war period. 
This was partly due to legal restrictions, partly to a general recognition 
of the soundness of the procedures, and partly to the fact that the 
procedures were flexible enough to operate effectively in a rapidly shift- 
ing reporting structure. From time to time, however, this procedure 
was supplemented by the work of special committees. 


COMMITTEE FOR THE REVIEW OF DATA REQUESTS FROM INDUSTRY 


At no time during the war period was there an absence of strong 
industry criticism of WPB forms. WPB data requests constituted a 
burden on industrial operations. One of the first efforts to lighten the 
burden was begun in June 1942 by a Committee for the Review of 
Data Requests from Industry. 

This committee, composed of representatives from business, the 
Army, the Navy, and the Bureau of the Budget, had wide powers to 
correct shortcomings of WPB questionnaires. The establishment of 
the Committee, and its system of sub-committees and industry ad- 
visors, did not conflict with past clearance procedure. Rather, it 
created a “Court of Appeals” which constituted a new clearance hurdle. 

The Committee had the support of all responsible executives in the 
WPB and all others interested in improving the elaborate and extensive 
questionnaire structure created by the WPB. Each WPB division re- 
sponsible for data requests was asked by the Committee to review its 
questionnaires and to supply the Committee with a statement of repet- 
itive report forms in use showing: those which should be continued, 
those which should be modified, and those which could be discontinued. 
An elaborate system of subcommittees was created to review various 
types of questionnaires. Industry representatives were invited to visit 
Washington and, in collaboration with the Committee, to make a re- 
view of each repetitive reporting form. Further, the Committee solic- 
ited criticism and suggestions from over 1,000 trade associations and 
their members in a drive to ferret out unnecessary duplication, to define 
areas where consolidation of reports could be effected, and to restudy 
the reports which caused difficult filing problems. In response to this 
request, a total of approximately 40,000 letters were received in mid- 
1942 embodying the important complaints, criticisms, and suggestions 
of industry. 

The vast energy expended and the calibre of talent used produced 
disappointing results. These were made the poorer by exaggerated 
claims of success by the Committee. The Committee itself was responsi- 
ble for relatively few revisions or discontinuances. Most revisions and 
discontinuances claimed had been planned by the WPB divisions be- 
fore the Committee was established. Pressures generated by industry 
criticism and the existence of the Committee no doubt hurried the 
process but the fact remains that relatively few revisions or discon- 
tinuances resulted from the literally blinding light focused on each 
inquiry. These facts support the conclusion that the complex and bur- 
densome reporting system was the result not of poor questionnaire 
control, inadequate policing, or the absence of self discipline, but rather 
of confused basic WPB policies. And despite the vast reporting struc- 
ture, there is reason to believe that the WPB erred on the side of too 
little information rather than on too much. 

The quality of industry’s critical response was also disappointing. 
Analysis of the thousands of letters from industry demonstrated that 
the burden of the WPB requests on industry, and industry objection to 
reporting forms, centered almost exclusively on the basic material re- 
port forms used under the individual allocation systems. The principal 
difficulty of industry in filing these forms resulted from the request for 
end use information which industry neither had nor could readily get. 
Of the hundreds of forms in use, only a handful outside of the raw 
material control area were the subject of industry comment. The 
trouble lay not in the report forms themselves but in the material 
control systems which they implemented. Most of the report forms in 
use were needed to facilitate operations under the various overlapping 
and inconsistent control systems in force. Before order could be brought 
into the reporting system there had to be a definite policy established 
for control of material flows. The coordinated efforts of industry in 
criticizing the WPB questionnaires added practically nothing to what 
was already known to statistical technicians and the top staff of the 
WPB. 

Even though the Committee was less successful than its claims, it 
performed a valuable function. It focused the attention of businessmen 
both in the WPB and in industry on the problems of wartime informa- 
tional requirements, especially the character of the information needed 
by government and the ability of industry to supply the data. As a 
result, a decision was made to establish on a permanent basis the ma- 
chinery through which the viewpoints of industry could be represented 
in the issuance and continuance of the WPB’s reporting forms. 

In October 1942 the permanent position of Chairman of the Commit- 
tee on Data Requests from Industry was established and filled by a 
businessman. This function continued well into 1944 when, as a result 
of a sharp decline in the introduction of new data requests, the position 
was abolished. 


ADVISORY COMMITTEE ON GOVERNMENT QUESTIONNAIRES 


In addition, by a letter dated October 9, 1942, the Director of the 
Bureau of the Budget requested the President of the Controllers’ 
Institute of America to establish an Advisory Committee on Govern- 
ment Questionnaires composed of five members, one representing each 
of the following industry organizations: the National Association of 
Manufacturers, the American Trade Association Executives, the Con- 
trollers’ Institute of America, the Chamber of Commerce of the 
United States, and the American Retail Federation. The Committee 
was asked to take steps necessary “to make the advisory services 
of business available to Government agencies... in promoting the 
improvement, development and coordination of all federal statistical 
services.” The Committee, later expanded to include representation 
from the National Association of Commercial Organization Secre- 
taries, and the National Industrial Council, was independent of gov- 
ernment and responsible only to the business community. Its functions 
were purely advisory. 

The Committee operated in large part through subcommittees. 
Some groups restricted their work exclusively to a single industry, as 
for example questionnaire problems of the arc welding industry. In 
other cases, subcommittees represented production segments of an 
industry, such as the organic chemical industry, or a functional part of 
an industry, as for example problems of accounting in questionnaires of 
the public utility industry. In still other cases subcommittees devoted 
their attention to general problems that affected many lines of business, 
as for example industry’s problems in filing WPB-732. Subcommittees 
met regularly and devoted themselves to the intimate details of report- 
ing problems. Panels were created in emergencies to consider nonre- 
curring problems. Exclusive of such panels, 58 subcommittees were 
appointed during the first year of the Advisory Committee’s existence. 

The Advisory Committee performed a valuable service to govern- 
ment, a service not measurable in quantitative terms. During its first 
year of operation some 300 specific recommendations were made to the 
Bureau of the Budget and substantially all were accepted both by the 
Bureau and the agencies whose programs were affected. There were 
also hundreds of unrecorded instances where questionnaires were volun- 
tarily revised and substantially improved by persons and agencies 
affected before their original issuance. In addition, as noted by the 
Director of the Bureau of the Budget in a letter to the Chairman of the 
Advisory Committee on Government Questionnaires, dated December 
3, 1943, its existence encouraged greater care in the development of 
governmental forms and greater attention to the abilities of respondents 
to supply the information requested. Although these considerations 
applied to all Government agencies, the bulk of the work of the Com- 
mittee pertained to questionnaires of the WPB since its surveys loomed 
so large in the total of all Government inquiries. 


TASK COMMITTEE ON REDUCTION OF SPECIAL APPLICATIONS 


In mid-1943 renewed industry complaints about the WPB’s informa- 
tional demands coincided with a general over-all improvement in sup- 
ply-demand relationships in many basic materials and intermediary 
products. These circumstances made it difficult to visualize the need 
for continuing many reports. A Task Committee on Reduction of Spe- 
cial Applications was therefore established by the Operations Vice 
Chairman and the Program Vice Chairman to determine the minimum 
number of special applications that should be required from industry. 
Primary responsibility for a general housecleaning was placed upon the 
WPB Division Directors. Since the Chairman of the Task Committee 
was also the Deputy Vice Chairman for Operations with full authority 
to act when a decision was made, prompt results were expected. 

A reduction of 40 per cent in the applications required from industry 
was set as the objective to be achieved through the following methods: 

“1. Elimination of allocation or special authorization procedures 
wherever supply warrants. 

2. Substitution of limitation orders or restrictive provisions in 
orders for those which now require special applications. 

3. Small order exemptions to be incorporated in the existing or- 
ders so that not less than 10 per cent of supply will be dis- 
tributed in this manner unless conclusive evidence is given 
justifying a lower percentage. 

4. Replacement of consumer applications by suppliers’ order 
board reports. 

5. Substitution of quarterly applications for monthly applica- 
tions. 

6. Substitution of quotas for special applications (including MRO) 
now required wherever consumption and use have been estab- 
lished.” 

Coincident with work on weeding out unnecessary application forms, 
the Task Committee was to attempt standardization wherever possible, 
particularly as to end-use information; to ask only for facts readily 
available to manufacturers and eliminate those questions which manu- 
facturers had to answer from supplier or customer records; and to re- 
duce wherever possible paper work other than questionnaires. 


COMMITTEE ON DEMOBILIZATION OF CONTROLS AFTER V-E DAY 


In the summer of 1944, when the collapse of German resistance 
seemed to be imminent, the pace of reconversion preparations, which 
had been in progress from as far back as the spring of 1943, accelerated 
quickly. Leading the reconversion plans was a Committee on De- 
mobilization of Controls after V-E Day, familiarly known as CODCAVE. 
When the Battle of the Bulge in the winter of 1944-45 shifted attention 
from reconversion to renewed war-production, a new Committee on 
Period One (CPO) was established in February 1945 to review the 
program developed by CODCAVE and keep it up to date. This Commit- 
tee functioned until V-J Day. When Germany collapsed it had a com- 
plete program for relaxation of controls ready to be put into effect at 
once. Despite the unexpectedly short period between V-E Day and 
V-J Day, it had a similarly complete plan when the Japanese surren- 
dered. 

The fundamental concern of these Committees, of course, related to 
basic production and distribution controls. But accompanying those 
controls were reporting instruments. Plans concerning questionnaires 
were developed by a Sub-Committee on Reports and Data Requests. 
As a part of the work of the principal Committee, every control policy, 
order and regulation was reviewed by the subcommittee. 

The principal objectives which the subcommittee established in re- 
viewing questionnaires are worthy of note. First, of course, was the 
desire to eliminate all non-essential reports growing out of relaxation 
of basic controls and management needs. Second, it was decided to 
maintain an intelligent reporting program which would permit the 
WPB and other war and reconversion agencies to be properly informed 
and prepared to take any necessary action in the reconversion period. 
Third, it was considered important to preserve in an uninterrupted 
series all reports which could be regarded as having post-war value to 
permanent Government agencies. In contrast with the approach to 
comprehensive elimination of restrictive controls, it was felt that in re- 
vising its reporting requirements following V-E Day, the WPB should 
err on the side of caution rather than over-hasty relaxation. The ex- 
perience of five years of struggle in building up a reporting system 
and in fully understanding the strategic role which it played in manage- 
ment was not lost. It was considered to be good policy for the WPB 
to keep its “steering gear and brakes in good condition,” a function 
which could be performed by the maintenance of minimum reporting. 

Preparations made on the basis of these principles resulted in a sub- 
stantial reduction of questionnaires upon the arrival of V-E Day and a 
further substantial reduction on V-J Day. The reductions were orderly 
and resulted in the maintenance of reporting instruments which proved 
to be of great value in the post-war reconversion period. 


CENTRAL REVIEW PROBLEMS AND POLICIES 


The core of the central review problem rested in the effort to satisfy 
all competing pressures present at the point of central review and at 
the same time gradually raise technical levels of statistical proficiency 
and sound operating procedures. Each data request originally had to 
be reviewed not only as a technical and professional problem but also 
in the light of a wide variety of complex influences growing out of the 
nature of the WPB organization, the personalities of administrators, 
the status of various control mechanisms, the degree of statistical co- 
ordination present, the urgency of data requirements, present and pro- 
spective control policy, industry’s ability and willingness to comply, 
the nature of other comparable questionnaires and literally dozens of 
other such political, administrative, and technical issues. And once a 
decision was made it could not be forgotten. Review of past actions was 
a necessary accompaniment to original control, for the WPB divisions 
often maintained and supported questionnaires whose usefulness had 
ended or whose efficiency could be improved by the application of newly 
tested technical methods. 

Fundamental in the policy of form clearance was the idea that better 
results would be obtained in the long run if emphasis were laid not so 
much upon rigid adherence to fixed procedures as upon the personalities 
concerned with questionnaire formulation and data use. WPB’s per- 
sonnel was a heterogeneous group of administrators, with varying de- 
grees of background in methods of data collection and use, and with 
substantial authority and responsibility in the management of in- 
dividual products and materials. It was natural for these individuals 
to insist that, upon the basis of their knowledge of their own indus- 
try and war needs, a survey developed by them not only was necessary 
but also administratively practical. It was hard for central clearance 
to argue against the feeling that “it is necessary to have this informa- 
tion in this way to fulfill the responsibilities of this office.” Reconcilia- 
tion of such ideas with technical standards and industry’s ability to 
comply was often not easy. 

The approach to such questions was conciliatory. The goal was to 
nurture acceptance of the idea that questionnaire clearance existed to 
aid administrators to solve reporting problems, to harmonize their 
methods with others in the WPB, and to utilize experience of others in 
making the solution of their problems easier. It was a policy of aid 
rather than dictation. 

Such policies were effective. They generally brought a spirit of 
reason into discussions of problems which otherwise might have been 
settled arbitrarily with resulting appeals to higher authority and bitter 
conflict. 

Neither administrators nor industry advisory groups were always 
thoroughly familiar with the record-keeping practices of industry from 
which data requests had to be filled. There were many instances of 
violent critical reaction from industry to forms which had been ap- 
proved by the WPB and industry advisory committees. In some in- 
stances forms had to be withdrawn because industry was in no position 
to respond. In others, special “Task Groups” had to be established to 
simplify questionnaires before the great majority of respondents could 
be expected to comply. There were many instances in which those who 
should have had knowledge of their industry’s ability to file reported 
information were responsible for data requests which could not be filed. 
Often, against the limited knowledge of sponsors of questionnaires, 
clearance had to array information concerning the ability of industry 
to comply. 

In addition, of course, were the problems of matching the proposed 
questionnaire against technical standards, acceptable statistical 
methodology, the WPB’s control policies, and sound administrative 
procedure. From the beginning it was accepted practice in the clearance 
function to assume that those responsible for policy and administrative 
action should themselves define their information needs. In view of this 
policy, form clearance devoted itself to solving those problems which 
might jeopardize the collection, tabulation, and use of required sta- 
tistics. But this was not always easy and the line between conciliation 
and dictation often was blurred. 

Important in the functioning of centralized form clearance was the 
adoption of technical statistical standards and methods to current 
questionnaire problems. The form clearance problem was one of 
watching experience throughout the WPB and applying that experience 
to the acquisition of information needed for management. The history 
of form clearance in this regard was an uphill climb in instituting ac- 
cepted methods in connection with the use of “cut-off” points, stand- 
ardization of forms applicable to general orders, gearing data requests 
to industry record-keeping practices, eliminating “interesting figures” 
not essential for the purposes of the questionnaire, and generating a 
freer flow of information from one division to another to avoid duplica- 
tion and overlapping of data requests. Many desirable technical meth- 
ods could not be introduced arbitrarily, but through a process of edu- 
cation their incorporation resulted in a gradual improvement in ques- 
tionnaires. As experience brought with it assurance of rightfulness, 
there were many technical standards applicable to surveys and ques- 
tionnaires that had to be met before form issuance could be approved. 

Relating reporting to policy raised a dual problem. On the one hand, 
when a policy was established which resulted in questionnaires with 
which industry could not comply an effort had to be made to alter the 
policy. On the other hand, when a policy was established and a ques- 
tionnaire contradicted that policy the questionnaire had to be modified 
or rejected. Implications of these problems were far-reaching. 

Policies connected with early individual allocation systems illustrate 
the first type of problem. As discussed previously, most of the early 
allocation systems relied heavily upon acquisition of information which 
it was often exceedingly difficult and sometimes impossible for indus- 
try to report. The problem was attacked in a number of ways. First, 
an effort was made to develop in the WPB an understanding of the dif- 
ficulties in industry in reporting end-use information. Increasing under- 
standing eventually led to requests for types of end-use information 
generally available in the records of respondents. As a result, many 
so-called end-use systems developed into use systems. In many areas, 
however, as noted elsewhere, such results were not achieved. Second, an 
effort was made to prevent the rise of numerous new allotment systems 
which conflicted with those in use. Although form clearance did not 
have authority to correct or initiate fundamental policy it did have 
authority to disapprove new questionnaires. Judicious exercise of this 
power resulted in discouraging many new allotment schemes. Third, 
strenuous efforts were made to eliminate duplication and overlapping 
of reporting by the modification of existing questionnaires supporting 
the systems. The eventual effect was to crystallize control policy and 
coordinate many individual allocation systems into workable patterns. 

Tying reporting to established policy was often more difficult when 
basic policy was itself confused and overlapping. In such circumstances, 
relating questionnaires to policy was not a simple undertaking. The 
coterminous existence in 1942 of the Production Requirements Plan 
and the many individual allocation systems is a case in point. Consid- 
erable conflict and duplication existed among these systems and provided 
a fertile field for form clearance to launch a unification program. Effec- 
tive corrective measures, however, were impossible in the absence of 
clear-cut top policy. Means for taking some corrective measures came 
only with clarification of basic metal allocation policy upon the an- 
nouncement of the Controlled Materials Plan. 

Developments relating to scheduling of components provide another 
illustration of the way in which reporting standards were sometimes 
subordinated and thwarted by ill-defined basic policy. Before the 
adoption of M-293 (the General Scheduling Order) various WPB divi- 
sions had instituted action to request manufacturers of fabricated com- 
ponents to report their individual orders. The purpose of such ques- 
tionnaires was to place the WPB in a position to direct shipment of a 
specific item to a specific customer. Such a procedure not only was con- 
trary to the basic policy of the priority system as it then existed but 
necessitated more personnel to handle the mass of data which would be 
collected than the divisions possessed. In addition, any benefits which 
might have been obtained were small when compared with the tre- 
mendous reporting burden such questionnaires would have created for 
industry. With few exceptions, these requests were therefore disap- 
proved. 

Issuance of M-293 early in 1943 altered WPB basic policy and also 
clearance policy with respect to scheduling forms. Central clearance 
recognized the fact that the mass of information required by M-293 
could hardly be used in a fashion to justify the heavy reporting burden 
imposed on industry. But central clearance, in the face of firm WPB 
policy, was helpless to disapprove the questionnaires for such reasons. 

Estimates of the impact of WPB questionnaires on industry were 
always considered in the review of questionnaires. The establishment 
of committees to study the problem has already been discussed. Be- 
yond the problem of eliminating needless questionnaires lay the more 
intricate problem of adapting questionnaires themselves to fit industry 
records. Interest in this problem was not entirely the result of concern 
for industry. By facilitating industrial response the WPB could obtain 
more accurate, complete, and prompt response to data requests. 

Attention had to be given to programs of other war agencies and the 
utilization of other agencies in both the mailing and tabulating of ques- 
tionnaires. The clearance group sought to coordinate, directly and 
through the Bureau of the Budget, the WPB reporting program with 
that of other Federal agencies. In addition, arrangements were made 
with the Bureau of Mines, the Bureau of Labor Statistics, the Bureau 
of the Census, and other agencies to modify their regular reports to 
meet special needs of the WPB. In some cases, WPB orders made man- 
datory the filing of another agency’s report since that was the easiest 
method to obtain necessary information. 

Of much importance in the WPB’s operations was the use of other 
agencies in collecting and tabulating WPB reports. These agencies 
made available to the WPB technical skills in coding, editing, and tabu- 
lating questionnaires. Arrangements for acquiring these services were 
made through the clearance group. 


Experimental Designs in Sociological Research. F. Stuart Chapin (Professor of 
Sociology, Chairman of the Department, and Director of the School of Social 
Work, University of Minnesota, Minneapolis, Minn.), New York 16: Harper & 
Brothers (49 East 33rd St.). 1947. Pp. xi, 206. $3.00. 


Review By O. KEMPTHORNE 
Associate Professor of Statistics, Iowa State College 
Ames, Iowa 


IN THE introduction the author states “the newly developed research meth- 
ods of experimental design, applied in the community situation, seem to 
point the way by which we can overcome, in the course of time, some of the 
complexities of social interaction that have hitherto baffled rational and 
objective description of certain acute problems of human relations.” There 
can be no disagreement with this statement; and, in fact, one could well take 
the viewpoint that until the general methods of experimental design as 
applied, for example, to biological research are part of the approach to the 
investigation of social situations very little progress can be made. It is un- 
fortunate then that the author has gathered together in this book, as note- 
worthy examples of the use of experimental designs in sociological research, 
nine studies, only one of which would satisfy the elementary basic require- 
ments of a modern experiment. This statement will be dealt with later in the 
review. First, however, it would be well for the reviewer to state that he 
realizes the difficulties which arise in any attempt to apply to sociological 
research the principles of experimental design as laid down for example by 
R. A. Fisher in The Design of Experiments. It is very easy for the agronomist 
to divide his experimental area into plots to which treatments are applied 
at random, whereas the division of a community into several parts for ran- 
dom application of sociological treatments, such as varying types of housing, 
is extremely difficult. It is possible to visualize the application of sociological 
treatments to communities selected at random, though the pressure of public 
opinion may prevent research workers from adopting this procedure for any 
important problems. Professor Chapin discusses these difficulties at some 
length throughout the book, and the biologist or agronomist can only feel 
happy that his problems are so much simpler. 

The chapters of the book are entitled: (1) Natural Social Experiments by 
Trial and Error; (2) Three Experimental Designs for Controlled Observation; 
(3) Cross-Sectional Experimental Design; (4) Projected Experimental De- 
sign: The Classical Pattern of “Before” and “After” Experiments That Oper- 
ate from the Present to the Future; (5) Ex Post Facto Experimental De- 
sign: From Present to Past; (6) Sociometric Scales Available for Control 
and the Measurement of Effects; and (7) Some Fundamental Problems and 
Limitations to Study by Experimental Designs. This is an imposing array 
of chapter headings with four different types of experiments mentioned, 
natural, cross-sectional, projected, and ex post facto. The question arises 
immediately of what the author understands by the words “experiment” 
and “experimental design.” The word “experiment” has been used presum- 
ably for centuries to describe the taking of observations according to some 
plan, though in the biological sciences it now connotes the application and 
comparison of various different treatments or technical procedures. The 
necessity for clear thinking on this point has been emphasized in a recent 
paper* where the author differentiates between two types of experiment, 
the one for example to measure an absolute constant such as the velocity 
of light and the other for example to evaluate the effect of a special dietary 
supplement. The usage of the word “experimental design” is well established 
by now to mean a plan for performing a comparative experiment. This implies 
that various treatments are actually applied by the investigator and are not 
just treatments that happened to have been applied to particular units 
for some reason, known or unknown, before the “experiment” was planned. 
This condition rules out practically all of the experiments and experimental 
designs discussed by the author. The so-called “natural” experiment should 
be described as a phenomenon and has little inductive value; for it is well 
established that the same set of environmental conditions will not produce 
the same effect in all cases. The “cross-sectional” designs discussed have a 
superficial appearance of validity, and would be completely valid if (and only 
if) a randomization procedure had been adopted in assigning treatments. 
The fact that pairs of individuals have been matched on several control 
factors does not mean that they are identical: matching of experimental 
units is the normal procedure in all types of experiment, but is useless 
without randomization and replication by which the closeness of the match- 
ing may be estimated. Experimental errors are frequently quoted in the 
book when there is neither randomization nor replication of the treatments. 
The author appears (Chap. 7 and elsewhere) to be aware of these considera- 
tions, which if given their proper weight would have resulted in the exclusion 
of practically all the studies on the grounds that they are examples of the 
misuse of basic concepts and are incapable of unbiassed interpretation. The 
chapter on “Projected Experimental Design” contains the only one of the 
nine studies in which the principle of randomization was used, namely Dr. 
Reuben Hill’s study on “Staff Stimulation to Social Participation and Social 
Adjustment.” This study was not, however, entirely satisfactory because, of 
the population of 306 freshmen considered, there were only 171 for whom 
complete records were available. It appears that what the author terms a 
projected design is satisfactory if the concepts of randomization and replica- 
tion are used properly. The reviewer fails to understand how the “Ex Post 
Facto Experimental Design” can be given a logical justification. An example 
of this “design” was the determination of the relationship of public housing 
in New Haven to juvenile delinquency, in which the earlier records of 317 
families who had lived for a few years in one of the Developments of the 
New Haven Housing Authority were compared to their records during these 
few years. Some of the difficulties in the interpretation of such data are 
obvious to anyone experienced in the analysis of data and there is no need 
to discuss them. 

* Anscombe, F. J. “On the Validity of Comparative Experiments.” J Royal Stat Soc (In press). 
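
The requirement that matching be accompanied by randomization and replication can be made concrete with a minimal sketch. It is an illustration only, not a procedure taken from the book under review: the function names, the pair labels, and the fixed seed are all hypothetical. Within each matched pair a random draw decides which member receives the treatment, and the replication over pairs is then used to estimate the error of the mean difference.

```python
import math
import random
import statistics

def randomize_matched_pairs(pairs, seed=None):
    """Within each matched pair, pick the treated member at random."""
    rng = random.Random(seed)  # seed is only for reproducibility of the sketch
    assignments = []
    for first, second in pairs:
        if rng.random() < 0.5:
            assignments.append({"treatment": first, "control": second})
        else:
            assignments.append({"treatment": second, "control": first})
    return assignments

def standard_error_of_mean_difference(differences):
    """Standard error of the mean treatment-minus-control difference.

    Needs at least two pairs: without replication there is no
    estimate of experimental error at all.
    """
    n = len(differences)
    return statistics.stdev(differences) / math.sqrt(n)

# Hypothetical matched pairs of experimental units.
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]
for row in randomize_matched_pairs(pairs, seed=0):
    print(row)
```

If the assignment within pairs is not random, the quantity returned by `standard_error_of_mean_difference` can still be computed, but it no longer has a clear interpretation as an experimental error.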

There appears to be some confusion in the statistical literature on the 
differences between experimental designs and survey designs. Both involve 
patterns by which observations are taken, but in experimental designs the 
objective is to estimate the effects of applied treatments and in survey work 
the objective is to estimate certain characteristics for a particular defined 
population. There are possibilities of using experimental design patterns in 
survey work, as for example the use of deep stratification with a Latin 
Square. Again, if the purpose of the survey is to establish the existence of 
associations in the population, a sample that is balanced on some factors may 
be useful. Causal relationships are frequently inferred from such associations, 
but they cannot be regarded as having the sound scientific basis of inferences 
drawn from experimental data. 

Apart from the above objections which are in themselves sufficient to 
make the book unreliable if not dangerous for workers in social sciences, 
there are very many instances of imprecise thinking and writing. Two ex- 
amples are sufficient evidence: “The conventional tests of the significance of 
sampling are based upon the theory of random samples, and in the present 
stage of experimental work, as we have attempted to show, it is the terminal 
homogeneity and purity in the sample, rather than initial representativeness 
of heterogeneity, that is important in demonstrating the real relationship 
between the two variable factors” (p. 104), and “Furthermore, since the 
critical ratio of 2.90 is computed from a difference between extremely pure 
samples, its significance is much enhanced by this fact” (p. 107). (The “criti- 
cal ratio” appears to be Student’s “t”.) 

The author in spite of numerous references and discussions of the question 
does not appear to understand the concepts of prediction and causation. 
Mathematical economists have for some time differentiated between the two 
problems of predicting events and estimating structural relationships. The 
interpretation of correlation coefficients appears to the reviewer to be par- 
ticularly vague. 

Further general criticisms of the book are that the author takes as the 
fundamental rule of the experimental method the varying of only one condi- 
tion at a time with all other factors maintained constant (p. 1), and there is 
lack of clear discussion of the formulation of hypotheses in an unambiguous 
way before looking at the experimental data, this formulation of course in- 
cluding the definition of some population for which the conclusions are to be 
relevant. 


Principles of Medical Statistics, Fourth Edition. A. Bradford Hill (Professor of 
Medical Statistics and Director of the Department, London School of Hygiene 
and Tropical Medicine, University of London, London, England). London 
W.C.2: Lancet Ltd. (7 Adams St., Adelphi), 1948. Pp. xi, 252. 10s. 6d. 


REVIEW BY J. YERUSHALMY 
Professor of Biostatistics, University of California 
Berkeley, California 


MANY workers in clinical and preventive medicine have found the previous 
editions of this book the most satisfactory source for acquiring a work- 
ing knowledge in statistics and an understanding of the role of quantitative 
methods in medicine and public health. The publication of the fourth edition 
is evidence at once that the demand for knowledge in this field is present in 
an increasing number of members of the medical professions and that this 
book is singularly effective in satisfying this demand. 

Of greatest interest to professional statisticians are the techniques of 
presentation which accomplish the difficult task of making statistical meth- 
ods palatable to a nonmathematical audience. These techniques, in the 
opinion of this reviewer, derive their effectiveness from a sound evaluation 
by the author of the intellectual level of the readers for whom the book is 
intended. It is apparent that Professor Hill addresses himself to an audience 
of intelligent and capable persons in their own fields of endeavor, even if they 
are lacking in mathematical training. Consequently he does not “talk down” 
to them; rather he introduces the statistical concepts in the framework of 
medicine and public health with which his readers are familiar. He thus 
succeeds in developing the statistical techniques without the use of mathe- 
matical symbols and formulae in such a way that the rationale behind them 
is comprehended. The concepts become acceptable as logical entities although 
no mathematical derivations are presented. 

One minor exception to the generally lucid and comprehensive method of 
presentation may be mentioned. In discussing the standard error of the 
difference between two sample proportions Professor Hill presents the usual 
two formulae, one in which an estimate of the universe proportion is ob- 
tained from a combination of the two samples, the other in which individual 
proportions of the two samples are used in the determination of the standard 
error of the difference. The explanation of the latter technique, however, is 
presented in terms of two different estimates of the standard error of the 
proportion in the universe (pp. 111-112). It would probably be more easily 
acceptable to the reader if the hypothesis of a single universe was not retained 
in the second approach. 
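
The distinction between the two formulae can be sketched as follows. This is a minimal illustration using the standard textbook forms, not a transcription of Professor Hill's pages; the function names and the sample counts are hypothetical. The first function pools the two samples into one estimate of a single universe proportion; the second uses each sample's own proportion and so does not assume a single universe.

```python
import math

def se_difference_pooled(x1, n1, x2, n2):
    """Standard error of p1 - p2 using one pooled estimate of the universe proportion."""
    p = (x1 + x2) / (n1 + n2)
    return math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

def se_difference_unpooled(x1, n1, x2, n2):
    """Standard error of p1 - p2 using each sample's own proportion."""
    p1 = x1 / n1
    p2 = x2 / n2
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical counts: 30 of 100 in one sample, 50 of 100 in the other.
print(se_difference_pooled(30, 100, 50, 100))
print(se_difference_unpooled(30, 100, 50, 100))
```

The pooled form is the natural one for testing whether both samples could have come from one universe; the unpooled form is the natural one when the two universes are allowed to differ.
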

In this fourth edition Professor Hill has substantially elaborated and 
strengthened many of the weaker points of the previous editions. Tabular 
and graphic presentation and the description and calculations of averages 
are more adequately covered. Sampling is discussed more adequately and the 
concept of normality is presented more clearly. The development of the life 
table and its functions is particularly lucid. The chapters on sampling, aver- 
ages, proportions, differences, and chi square are enriched by numerous 
examples, many of them in the field of public health. There are also additions 
in the number of the very instructive examples of fallacies, difficulties, and 
problems of selection. 

In spite of the additions and elaborations of so many topics, Professor Hill 
succeeded nevertheless in his objective to have the book “kept within 
limited bounds and confined to ‘arithmetic guided by logic’.” 


The Teaching of Statistics: A Report of the Institute of Mathematical Statistics 
Committee on the Teaching of Statistics. Harold Hotelling, Chairman (Professor 
of Mathematical Statistics and Head of the Department; Associate Director, 
Institute of Statistics; University of North Carolina), Walter Bartky (Dean of the 
Division of the Physical Sciences and Professor of Applied Mathematics, Univer- 
sity of Chicago, Chicago, Ill.), W. Edwards Deming (Adviser in Sampling, Bureau 
of the Budget, Washington, D. C.), Milton Friedman (Associate Professor of 
Economics, University of Chicago, Chicago, Ill.), and Paul Hoel (Associate Pro- 
fessor of Mathematics, University of California, Los Angeles, Calif.). Reprinted 
from Ann Math Stat 19(1):95-115 Je ’47. Washington 25, D. C.: National Re- 
search Council (2101 Constitution Ave.), 1948. Pp. 95-115. Paper. 


The Teaching of Statistics in Universities and Colleges. Based upon a report 
by the Royal Statistical Society’s Committee on the teaching of statistics: E. S. 
Pearson (Chairman), R. G. D. Allen, H. Campion, William Elderton, C. Oswald 
George, R. F. George, M. Greenwood, D. Heron, J. O. Irwin, H. Leak, E. C. 
Rhodes, R. Stone, L. H. C. Tippett, and J. Wishart. Preprint of J Royal Stat Soc 
110(1): 51-7 pt 1 ’47. London W.C.2: Royal Statistical Society (4 Portugal St.), 
1947. Pp. 7. Paper. 6d. 


REVIEWED BY TRUMAN L. KELLEY 
Professor of Education, Harvard University 
Cambridge, Massachusetts 


AS THESE reports cover the same ground, it is advantageous to compare 
their points of view and recommendations. The British report is the 
more realistic in that it starts with the status quo in the United Kingdom 
and Eire, where the teaching of statistics is not nearly as extensive as in the 
United States, and makes recommendations which may, perhaps, be at- 
tainable under the conflicting academic pressures. The American Report is 
idealistic. It states with conviction and well-documented argument what 
ought to be the situation, but it pays scant attention to the sundry historical 
origins of statistical development and teaching, or to the multitudinous 
reasons for specialization and differentiation. 

The British report states that “There are few more urgent needs today 
than the increase of the supply of first class statisticians,” and the American 
report repeatedly expresses the same idea. 

The British committee asserts that social science workers require a knowl- 
edge of “descriptive statistics.” There is no specific reference to this initial 
and elementary need in the American report. This matter is of importance, 
for the American committee believes that a knowledge of basic statistical 
principles is a part of a liberal education which all college students should 
get. This basic need is acquaintance “with the broad principles of inductive 
inference.” It seems to the reviewer that there is implicit in the British recom- 
mendation an induction of the student into statistics via the subject matter 
of his field of specialization, and in the American an induction via logic, 
including principles of mathematics and probability. It is needless to say 
that these approaches are far asunder. By the one approach the student 
is initially aware of virginal data and with later study acquires the power of 
correctly appraising it, while the other approach would see that the student 
first acquires the powers and the tools for sound appraisal and then applies 
them to the field of his choice. 

This second approach seems necessary if all students, irrespective of their 
variety of interests, become acquainted with statistics through a general 
and comprehensive orientation course. Both committees favor the inclusion 
of statistics (of a sort not clearly defined by the British committee) as a 
minor part of a universal basic course. 

Both committees believe, as stated by the British committee, that “the 
statistical approach is so fundamental to the modern way of looking at 
things . . . that it should form a part of the mental equipment of the edu- 
cated man...” This committee adds, with refreshing realism and modesty, 
that “... we do not propose at present that the universities should attempt 
to fill this gap in general education . . . ” However, both committees attack 
the problem, for they propose statistical courses, and the American commit- 
tee goes into detail in recommending the type of instructor that is needed. 

At the lowest level the British committee advocates a general course 
to familiarize students with statistical ideas and the most commonly used 
elementary methods, and the American committee asserts that “the funda- 
mental logic and philosophy of statistics can be taught at an early stage.” 
This is to include the broad principles of inductive inference, concepts of 
sampling variation, randomness, and statistical predictability, the difference 
between inductive and deductive statements, the nature of statistical esti- 
mation, and the nature of statistical hypothesis. As a teacher who has tried 
a variety of approaches, the reviewer recommends one which is not clearly 
within the scope of either of these committee recommendations—it is via 
data already within the field of interest and acquaintance of the student. 
The phenomena of social and scientific experience constitute the starting 
point. Their need for interpretation through sound logic is so obvious and so 
personally felt that there is no sense of unreality and no lack in motivation. 
Because of the variety of interest, a single introductory course does not suffice; 
so, contingent upon sufficient registrants, there may well be introductory 
courses in statistics in biology, agriculture, economics, education and psy- 
chology, physical sciences, etc. Such an approach is vigorously castigated by 
the American committee, which advocates an introductory course under the 
aegis of the department of mathematics (expanded, to be sure, to include an 
expert statistician) and which even goes so far as to imply that the present 
situation wherein many departmental introductory courses are given by 
semi-trained statisticians is worse than if no courses at all were offered. 

Both committees admit the paucity of adequately trained teacher person- 
nel. In the light of it we must, as a practical matter, consider the benefits 
derivable from a course given, say, in the college department of economics 
by (a) a well-trained mathematician and statistician who is quite ignorant of 
economic issues, (b) a well-trained economist who is quite ignorant of statis- 
tical niceties, or (c) one semitrained in both statistics and in economics. The 
reviewer observes that situations change and that with national standards 
in the various disciplines set by such men as Fisher, Hotelling, Pearson, etc. 
a second giving of a course by an (a), a (b), or a (c) will migrate from a first 
giving in a commendable direction. We can say that two decades of American 
procedure that has been contrary to that endorsed by the American com- 
mittee has resulted in a truly remarkable advance in experimental and ap- 
plied statistics. The British committee implies what the decades have 
proven, that high competence in statistics has not been dependent upon 
college courses in statistics. Such courses certainly should broaden the base 
from which individual genius can rise to an apex, but let us be wary of a 
definition of path which attempts to fix the initial steps or the terminal 
product. 

The reviewer gathers that the American committee pictures statistical 
training through the following sequence: (1) a first course, or part of a course, 
for all students, given by a mathematical statistician; (2) a course for “fut- 
ure consumers” given by an applied statistician; (3) a course for “future users 
of statistical methods” which is in part mathematical statistics and in part 
applied statistics; and (4) a course for “future research workers and teach- 
ers...,” given under the original parental roof of mathematical statistics. 
The British recommendations are not so specific, but they write, whether 
with approval or regret is not clear, that “it seems inevitable that there 
should be separate courses for . . . students specializing in various sciences.” 
They also say that “the necessity, at this level, of supplementing lectures by 
practical work cannot be too strongly emphasized.” The necessity for a 
number of centers for advanced courses is emphasized. The sort of advanced 
course here implied seems quite similar to number (4) of the American com- 
mittee scheme, for they write “the basis of statistical theory and its ally the 
theory of probability is essentially mathematical.” 

The American committee, by omission and by inclusion, reveals what it 
considers to be preparatory background for students of statistics. It at no 
point cites knowledge of data in some scientific field as essential. It urges 
knowledge of the calculus—making the happy suggestion that it is appro- 
priate for high school study—but fails to emphasize matrix and other ad- 
vanced algebra which are much more important for statistics. In the judg- 
ment of the reviewer, its greatest failure is in not building upon the usual 
normal psychological course of student development—(a) knowledge of 
general affairs, including a smattering of mathematics, (b) a developing 
interest in some scientific field, (c) an epochal experience when data are dis- 
covered to be an original source of new knowledge, (d) a determination to 
utilize this source and to acquaint oneself with the techniques for making it 
reveal its hidden information. Not until this last is tapped can we expect 
that men will devote themselves to the training necessary to become great 
statisticians and experimentalists. The reviewer believes that the end prod- 
uct in the teaching of statistics should transcend, if not in height certainly 
in breadth, that implied in the term “mathematical statistician.” 

The American committee deplores the general lack of mathematical com- 
petence of most teachers of statistics in different subject matter fields. This 
is deplorable as is their lack of knowledge of the genius of data in their fields. 
However, the progress of recent decades should make one optimistic, and 
these two committee reports should encourage college presidents to strengthen 
and broaden the instruction in both mathematical and applied statistics. 


Nomography. A. S. Levens (Associate Professor of Mechanical Engineering, 
University of California, Berkeley, Calif.) New York 16: John Wiley & Sons, Inc. 
(440 Fourth Ave.), 1948. Pp. viii, 176. $3.00. 


REVIEW BY M. KAC 
Associate Professor of Mathematics, Cornell University 
Ithaca, New York 


THIS book is devoted to an elementary presentation of the basic principles 
used in design of nomograms (alignment charts). The material is stand- 
ard but attractively presented and illustrated by numerous examples. 

The mathematical prerequisites are modest and consist of no more than 
elementary algebra (through logarithms) and plane geometry. 

The book is an outgrowth of nomography courses for engineering students 
and it is therefore only natural to find most of the applications and illustra- 
tions drawn from that field. There is no doubt that nomograms are useful as 
a computational aid in many engineering problems. The reviewer doubts, 
however, that nomography will prove very useful to statisticians. In fact, 
out of thirty-odd nomograms collected in the Appendix and which in the 
words of the author “may prove useful in the fields of engineering, produc- 
tion, business, and statistics” there are only two taken from the field of 
statistics. These are a nomogram for the standard deviation (p. 170), 

$$\sigma_y = \sqrt{\frac{\Sigma y^2}{N}},$$

and a nomogram for the correlation coefficient (p. 171), 

$$r = \frac{\Sigma xy}{N \sigma_x \sigma_y}.$$

Both of these nomograms can be used only after $\Sigma x^2$, $\Sigma y^2$, $\Sigma xy$ have been 
calculated. Since the computation of $\Sigma x^2$, $\Sigma y^2$, $\Sigma xy$ is by far the most labo- 
rious part, it seems a little naive to expect that the above nomograms will 
introduce a significant saving of time. 
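
The reviewer's point about where the labor lies can be illustrated with a short sketch. It rests on assumptions not confirmed by the source: that x and y denote deviations from their means, that the standard deviation uses the divisor N, and that the correlation coefficient is the usual product-moment form. All function names are hypothetical. The loop over the data in the first function is the laborious part; the two steps a nomogram would replace are single lines.

```python
import math

def sums_of_deviations(xs, ys):
    """The laborious part: sums of squares and products of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return n, sum_xx, sum_yy, sum_xy

def standard_deviation(sum_sq, n):
    """The step a nomogram would replace: one division and one square root."""
    return math.sqrt(sum_sq / n)

def correlation(sum_xx, sum_yy, sum_xy):
    """Product-moment correlation from the three sums (assumes nonzero variation)."""
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data.
n, sxx, syy, sxy = sums_of_deviations([1, 2, 3, 4], [2, 4, 6, 8])
print(standard_deviation(syy, n), correlation(sxx, syy, sxy))
```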


Foundations of Economic Analysis. Paul Anthony Samuelson (Professor of Eco- 
nomics, Massachusetts Institute of Technology, Cambridge, Mass.). Harvard 
Economic Studies, Vol. 80. Cambridge 38, Mass.: Harvard University Press 
(38 Quincy Street), 1947. Pp. xii, 447. $7.50. 


REVIEW BY GERHARD TINTNER 
Research Associate, Department of Applied Economics 
Cambridge University, Cambridge, England 
ON LEAVE: Professor of Economics and Mathematics 
Iowa State College, Ames, Iowa 


THIS is a very important contribution in the field of pure economics which 
ought to be of interest to econometricians and those statisticians who con- 
cern themselves with economic matters. 

The author endeavors throughout to derive “operationally meaningful” 
propositions in economics from the assumption of maximizing behavior. 
Operationally meaningful statements are hypotheses about empirical data 
which could conceivably be refuted, if only under ideal conditions. These 
ideas are related to those of some modern positivist philosophers, especially 
Bridgman. It is perhaps unfortunate that the author has not included a 
methodological chapter in his book. An explicit statement of his philosophical 
ideas might have contributed to the understanding of his book. It might have 
helped some economists who are sceptical about the applicability of radical 
positivism and behaviorism in the social sciences. 

The first part of the book is concerned with economic statics. It also in- 
cludes a chapter on welfare economics, which seems a little out of place. 
The theory of maxima in all its ramifications is treated at great length. There 
is also a more systematic mathematical appendix on this subject. 

On the basis of a very general approach the theory of cost and production 
and also the theory of consumer behavior are ably stated. An interesting 
feature in the treatment is the discussion of maxima if the functions in ques- 
tion are not continuous. This is a very important advance, though the dis- 
continuities considered may not be the most important ones met with in 
economic life. 

Monopoly and monopolistic competition get very little attention, to say 
nothing of more complicated situations like duopoly and bilateral monopoly. 
This narrowness of the point of view makes the book a little less useful than 
it might have been. 

A short section on the economic theory of index numbers (pp. 146-156) 
should be of particular interest to the statistician. It is probably the best 
short statement available about the modern theory of cost of living index 
numbers. 

A very interesting chapter on welfare economics concludes this first section 
of the book. In its brief compass it contains most of the important new results 
of the modern theory in this field. 

The second part of the book deals with economic dynamics. But the dis- 
cussion is almost entirely based upon purely formal consideration of equilib- 
ria and their relation to comparative statics. The “correspondence” principle 
of the author states that “the problem of stability of equilibrium is inti- 
mately tied up with the problem of deriving fruitful theorems in comparative 
statics” (p. 258). The author succeeds in reaching very interesting conclu- 
sions in this way. His work is inspired by the modern theory of dynamics (in 
physics), especially by the work of George D. Birkhoff (Dynamical Systems, 
1927). A mathematical appendix deals with the theory of difference and 
other functional equations. The examples given in this section of the book 
include also a very interesting “dynamization” of the Keynesian system. 
But it appears here and elsewhere that not many definite conclusions can be 
drawn, even after making a number of somewhat arbitrary assumptions. 

The second part of the book is hence somewhat disappointing to the econ- 
omist. It is evident that not many valid and interesting conclusions can be 
drawn from such narrow assumptions as made by the author. The whole 
theory of anticipations and expectations is either neglected or appears only 
incidentally. Uncertainty hardly gets an adequate treatment in the small 
space allotted to it. Samuelson’s book is in this respect inferior to the earlier 
work of J. R. Hicks (Value and Capital, 1939) and others who have en- 
deavored to deal with problems of economic risk and uncertainty. I believe 
that especially the theory of formation of anticipations forms a true link 
between static economics and the more useful and interesting economic 
dynamics. 

Chapter 11 deals with the classification of equilibria. This discussion is 
again based upon modern (physical) dynamics. The usefulness of the dis- 
tinctions made in economics is not immediately apparent. The subsequent 
discussion of the business cycle is very short and not on the high level main- 
tained in the earlier parts of the book. 

The author devotes towards the end of the book a few pages to the theory 
of stochastic systems. The treatment is much too brief to yield important 
results. Some of the ideas presented may however prove stimulating to the 
statisticians working in this field, especially the indications about nonlinear 
stochastic systems. 

In summary, the book ought to be recommended very highly to the mature 
reader who possesses an adequate knowledge of modern economic theory and 
higher mathematics. A conclusion which he may draw from its contents is 
the following: Assuming only maximizing behavior (something like the 
“economic principle” of the elementary text books) very few interesting 
conclusions about economic behavior can be drawn even by so brilliant a 
theoretician as Samuelson. There is a great deal of valid and important 
economic theory contained in his book. The reader might be inspired by the 
comparative failure of pure economics to devote his interest to a more prom- 
ising approach. Econometric investigations combine modern economic 
theory, which is so ably presented in this book, with modern statistical 
methods in order to derive valid empirical conclusions. 